Metamodeling for Large-Scale Optimization Tasks Based on Object Networks

Ludmilla Werbos, Robert Kozma, Rodrigo Silva-Lugo, Giovanni E. Pazienza, and Paul Werbos

Abstract: Optimization in large-scale networks, such as large logistical networks and electric power grids involving many thousands of variables, is a very challenging task. In this paper, we present the theoretical basis and the related experiments involving the development and use of visualization tools and improvements in existing best practices in managing optimization software, as preparation for the use of “metamodeling”: the insertion of complex neural networks or other universal nonlinear function approximators into key parts of these complicated and expensive computations. This novel approach has been developed by the new Center for Large-Scale Integrated Optimization and Networks (CLION) at the University of Memphis, TN.

Manuscript submitted February 10, 2011. This work has been supported in part by a FedEx Institute of Technology (FIT) Research Grant.
Ludmilla Werbos is with IntControl LLC and CLION, The University of Memphis.
R. Kozma, R. Silva-Lugo and G.E. Pazienza are with CLION, The University of Memphis. G.E. Pazienza is also with the Pazmany University, Budapest ([email protected]).
Paul Werbos is with CLION, The University of Memphis, and the National Science Foundation (NSF).

I. INTRODUCTION

This paper presents a combined method of solving and understanding scheduling and optimization tasks using both Linear Programming and Neural Networks. The problem of transportation network configuration and fleet assignment in a realistic case is multidimensional (hundreds of thousands of variables and constraints) and computationally challenging. At present, the best way to approach it is Mixed Integer Programming (MIP) [1]. This approach has led to notable successes, such as [2], [3], especially since the arrival of the Gurobi software [4] and faster multicore computers.

However, the more we succeed, the more we want. The scope of real-life problems is growing faster than software and hardware capabilities. A few years ago, we would have been satisfied to predict a typical day for manufacturing, distribution or logistics optimization purposes; now we want to see an image of a typical weekend or a week. That would give planners more power to consider all the opportunities and their combinations on a larger scale, instead of restricting themselves to short chunks of time modeled independently.

Because the goal here is to minimize the amount of time required to reach a desired level of accuracy, the best approach will include some use of metamodeling, that is, the use of fast universal approximators to approximate slower, more expensive models or computations. In this paper, we summarize the theory of inverse metamodeling and apply it to a practical problem related to large-scale optimization tasks.

The paper is structured as follows: in Sec. II, we introduce the theoretical background of metamodeling; in Sec. III, we present a practical problem and the numerical simulations obtained by using the principles of metamodeling; in Sec. IV, we discuss the importance of using proper tools to represent the knowledge obtained from our simulations; in Sec. V, we draw the conclusions and discuss the future developments of our project.

II. PRINCIPLES OF METAMODELING

A. Forward Metamodeling

In the world of stochastic optimization and operations research, “metamodeling” usually means forward metamodeling. In forward metamodeling, a neural network or some other function approximator is trained to approximate the function we are trying to minimize. We can represent a neural network as a function, f(x, W), where x are the inputs to the network, and W are the weights or parameters of the network. In the case of forward metamodeling, the output f is just a scalar, so we may write the neural network as f(x, W). The task is to minimize the cost function C(u, α) subject to some constraints, where:
• C is the total cost
• α is the set of all the inputs, including those in the constraints
• u is the set of all the things we are trying to optimize, including the fleet assignments

There are three steps to forward metamodeling here:
1. Pick a sample {u_i} of possible fleet assignments u_i, either all based on the same input vector α, or with different α. Let us assume that we have m sample points; in other words, i = 1 to m.
2. Calculate C(u_i, α) for each of these assignments.
3. Train the neural network f(x, W) to approximate C over this sample; in other words, find the weights W which minimize the error between f and C, for our sample of m cases, where x(i) is set to u_i or to u_i||α_i. Another way to say this is that we train the neural network f over the set of sample pairs {<u_i, C(u_i, α)>}.

Forward metamodeling is especially useful for stochastic optimization problems where it is very expensive to calculate the function C, or where the only way to get its value is to do a physical experiment. But in the practical logistics problems we have been looking at, it is generally very easy to calculate the cost C once u and α are known. Therefore, the usual forward metamodeling is not so useful here.

B. Inverse Metamodeling

The term “inverse metamodeling” is fairly new. The term itself is due to Russell Barton of Penn State [5], but it is possible to generalize the idea to include the following kind of inverse metamodeling, described in [6]:
1. Pick a sample of possible α, {α_i}, with i = 1 to m.
2. Run the optimizer to calculate the optimal u, u*(α), for each α, subject to the constraints.
3. Train the neural network f(x, W) to approximate the function u*(α) over the training set {<α_i, u*(α_i)>}. In other words, pick W so as to minimize the error between f and u*, when α is used as the input to the neural network.

If the neural network were an exact approximator, and extremely fast, this would be extremely useful in saving computation time; however, it is not likely to be quite so good in the beginning. It will require a very complex neural network to get a decent approximation in this application. This kind of research is the priority area for CLION, which specializes in complex neural networks. If the neural network is designed to match, as far as possible, the capabilities of the dedicated hardware implementation, it should be very fast in real time. Even if the approximation is not perfect, it may provide a fast warm start for the optimizer program. The value of warm starts varies considerably from problem to problem, but a warm start generally gives faster convergence than a random or randomized starting point.

C. Variations and Extensions

1) Gradient-assisted learning (GAL): With forward metamodeling, we can usually calculate ∇_u C for each sample point u_i, whenever we have a differentiable algorithm to calculate C itself, at a computational cost about the same as the computation of C itself. This comes from the use of the chain rule for ordered derivatives; see [7].

We can use Gradient Assisted Learning to train the neural network to match not only f and C, but ∇f and ∇C as well. This can be extended to inverse metamodeling as well. If the vector u has, say, 200 components, GAL causes a sample of N cases to be “worth” 200 times as much, as if 200×N examples had been collected. For inverse metamodeling of logistics tasks, it is necessary to get sensitivity information from the existing optimizer to make this possible, and build [...]

[...] what makes the mammal brain different from and more powerful than the reptile brain. We may or may not be ready to get into this kind of advanced capability, but it certainly is fundamental and important.

3) Robust optimization: Most of the US power grid is now managed by large Independent System Operators (ISOs) and Regional Transmission Organizations (RTOs), who use large optimization packages to make decisions ranging from commitment of generation a day ahead, to actual dispatch of generation 15 minutes ahead, to planning many years ahead. The optimization problems now solved by these systems [8] are not so different from those in logistics. When the utilities shift from day-ahead commitment to actual operations, they always have to make some (costly) adjustments, because of things which cannot be predicted ahead of time. Therefore, they can get better results by reformulating the optimization problem itself, by injecting noise. If one accounts for uncertainties in the vector u, the problem becomes more complicated in theory, but the actual surface of expected error becomes more convex, which would typically increase the value of warm starts and possibly allow better use of new methods.

III. OPTIMIZATION OF SCHEDULING FOR LARGE-SCALE LOGISTICS TASKS

A. Efficient Scheduling Process

Efficient scheduling is a crucial issue in many areas belonging to both defense and civilian domains, including disaster response, power distribution, transportation, manufacturing job scheduling, etc. [9-11]. Throughout the years, there have been countless efforts to tackle this problem; nevertheless, the appearance on the market of manycore platforms and advanced FPGAs offers unexplored possibilities, which require novel and more efficient approaches. The ultimate goal is to develop efficient nonlinear optimizers, like approximate dynamic programming [12], though we are far away from this goal: today's state-of-the-art optimization tools exploit mixed integer programming techniques.

In the present work, we used the Gurobi software to explore and better understand the problem of large-scale optimization tasks, which may include hundreds of thousands of variables. In particular, we focused on decision support to optimize the assignment of resources for the domestic market operation of a large distribution company, whose logistics operation has about 200 delivery points served by a fleet.
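
To make the three steps of inverse metamodeling (Sec. II-B) and the resulting warm start concrete, the following is a minimal Python sketch on a toy problem. Everything in it is an assumption made for illustration and is not taken from the CLION or Gurobi setup: the scalar cost C(u, α) = (u - 2α - 1)^2 with bounds 0 ≤ u ≤ 4, the projected-gradient routine `optimize` that stands in for the optimizer, and the linear model fitted by `fit_linear_metamodel` that stands in for the neural network f(x, W).

```python
# Toy inverse metamodeling: learn u*(alpha) from optimizer runs, then warm-start.
# The cost, the optimizer and the linear model are illustrative assumptions only.

def cost_gradient(u, alpha):
    """dC/du for the toy cost C(u, alpha) = (u - 2*alpha - 1)**2."""
    return 2.0 * (u - 2.0 * alpha - 1.0)


def optimize(alpha, u_start, step=0.1, tol=1e-8, max_iter=10_000):
    """Projected gradient descent on 0 <= u <= 4; returns (u, iterations used)."""
    u = u_start
    for iteration in range(1, max_iter + 1):
        u_next = min(4.0, max(0.0, u - step * cost_gradient(u, alpha)))
        if abs(u_next - u) < tol:
            return u_next, iteration
        u = u_next
    return u, max_iter


def fit_linear_metamodel(alphas, u_stars):
    """Least-squares fit of f(alpha, W) = w0 + w1 * alpha (stand-in for a neural net)."""
    m = len(alphas)
    mean_a = sum(alphas) / m
    mean_u = sum(u_stars) / m
    cov = sum((a - mean_a) * (u - mean_u) for a, u in zip(alphas, u_stars))
    var = sum((a - mean_a) ** 2 for a in alphas)
    w1 = cov / var
    w0 = mean_u - w1 * mean_a
    return w0, w1


# Step 1: pick a sample of inputs alpha_i, i = 1..m (an evenly spaced grid here).
m = 20
alphas = [i / (m - 1) for i in range(m)]
# Step 2: run the optimizer from a cold start to obtain u*(alpha_i).
u_stars = [optimize(a, u_start=0.0)[0] for a in alphas]
# Step 3: train the approximator on the pairs <alpha_i, u*(alpha_i)>.
w0, w1 = fit_linear_metamodel(alphas, u_stars)

# Warm start: use the metamodel's prediction as the optimizer's starting point.
alpha_new = 0.37
u_cold, cold_iters = optimize(alpha_new, u_start=0.0)
u_warm, warm_iters = optimize(alpha_new, u_start=w0 + w1 * alpha_new)
print(f"cold start: {cold_iters} iterations, warm start: {warm_iters} iterations")
```

On this toy problem the warm start needs far fewer iterations because the approximator is almost exact. With a real MIP solver and an imperfect network, the benefit would vary from problem to problem, as noted in Sec. II-B.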
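
For the gradient-assisted learning of Sec. II-C, one plausible way to write the training objective is as the sum of a value-matching term and a gradient-matching term. The squared-error form and the weighting factor λ are assumptions introduced here for illustration; the text above states only that the network is trained to match both f with C and ∇f with ∇C.

```latex
E(W) = \sum_{i=1}^{m} \Big[ \big( f(u_i, W) - C(u_i, \alpha) \big)^2
     + \lambda \, \big\| \nabla_u f(u_i, W) - \nabla_u C(u_i, \alpha) \big\|^2 \Big]
```

Under this reading, each sample point contributes one value target and, for a vector u with 200 components, 200 gradient targets, which is the sense in which N cases behave roughly like 200×N examples.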
