A Toolchain for Solving Dynamic Optimization Problems Using Symbolic and Parallel Computing

Evgeny Lazutkin Siegbert Hopfgarten Abebe Geletu Pu Li

Group Simulation and Optimal Processes, Institute for Automation and Systems Engineering, Technische Universität Ilmenau, P.O. Box 10 05 65, 98684 Ilmenau, Germany. {evgeny.lazutkin,siegbert.hopfgarten,abebe.geletu,pu.li}@tu-ilmenau.de

Abstract shown in Fig. 1. Based on the current process state x(k) obtained through the state observer or measurement, Significant progresses in developing approaches to dy- resp., the problem is solved in the opti- namic optimization have been made. However, its prac- mizer in each sample time. The resulting optimal control tical implementation poses a difficult task and its real- strategy in the first interval u(k) of the moving horizon time application such as in nonlinear model predictive is then realized through the local control system. There- control (NMPC) remains challenging. A toolchain is de- fore, an essential limitation of applying NMPC is due to veloped in this work to relieve the implementation bur- its long computation time taken to solve the NLP prob- den and, meanwhile, to speed up the computations for lem for each sample time, especially for the control of solving the dynamic optimization problem. To achieve fast systems (Wang and Boyd, 2010). In general, the these targets, symbolic computing is utilized for calcu- computation time should be much less than the sample lating the first and second order sensitivities on the one time of the NMPC scheme (Schäfer et al., 2007). Al- hand and parallel computing is used for separately ac- though powerful methods are available, e.g. multiple- complishing the computations for the individual time in- shooting (Houska et al., 2011; Kirches et al., 2012) tervals on the other hand. Two optimal control problems and collocation on finite elements (Biegler et al., 2002; are solved to demonstrate the efficiency of the developed Zavala et al., 2008; Word et al., 2014) with simultane- toolchain which solves one of the problems with approx- ous characteristics, control parametrization (Balsa-Canto imately 25,000 variables within a reasonable CPU time. et al., 2000; Barz et al., 2012) with sequential character- Keywords: nonlinear optimization, combined multiple istics, and quasi-sequential technique, (Hong et al., 2006; shooting and collocation, symbolic manipulation, paral- Bartl et al., 2011), the computation speed is not swift lel computing, satellite problem, combined cycle power enough for very fast systems such as mechanical, elec- plant trical and mechatronic systems. Therefore, it is highly desired to further enhance the computation efficiency for solving nonlinear dynamic optimization problems. 1 Introduction The combined multiple-shooting with collocation (CMSC) method (Tamimi and Li, 2010) and the modified Over the last decades nonlinear model predictive control multiple-shooting and collocation (MCMSC) method (NMPC) has been increasingly popular for the control of (Lazutkin et al., 2014) are proved to be highly efficient. complex systems (Mayne, 2014). To carry out NMPC, The efficiency of this method is considerably improved the first step is to formulate a nonlinear optimal control in this work with the following targets: problem. By using a discretization scheme over a pre- • to reduce the computation time by using symbolic diction horizon, it is then transformed into a constrained methods for calculating gradients, Jacobians, and nonlinear programming (NLP) problem. Finally, the re- Hessians, alization of NMPC is made by repeatedly solving this problem online with an NLP solver which requires ap- • to further accelerate the computation by using par- propriate function values and gradients. Although many allel computing facilities, especially for real-time theoretical progresses on NMPC have been achieved, its applications. implementation for real-life applications is certainly not To achieve these aims, this work develops a toolchain trivial. Therefore, a toolchain is developed in this work as described in subsequent sections. In section 2 based on open-source software tools to relieve the bur- the problem will be formulated. Section 3 illustrates dens in the implementation of NMPC. the interior-point solution method with symbolic com- A schematic description of implementing NMPC is putations of first- and second-order . The

DOI Proceedings of the 11th International Modelica Conference 311 10.3384/ecp15118311 September 21-23, 2015, Versailles, France A Toolchain for Solving Dynamic Optimization Problems Using Symbolic and Parallel Computing toolchain, its components, functionality, and some code examples are presented in section 4. The efficiency of Optimizer the toolchain is demonstrated in section 5 by applying it Solving optimal to a satellite control and a large-scale dynamic optimiza- u(k) control problem x(k) tion of a combined cycle power plant. Conclusions of the paper are given in section 6. Local State control system observer 2 Problem description

The nonlinear optimal control problem (NOCP) reads Process

t f min J = M x(t f ),t f + L(x(t),u(t),t)dt u(t)  Zt0   Figure 1. Nonlinear model predictive control (NMPC) scheme s. t. f (x˙(t),x(t),u(t),t)= 0 , t0 ≤ t ≤ t f , x(t0)= x0, x(t f ) fixed or free, g(x(t),u(t),t) ≤ 0, (1) Due to the combined character of the approach to be ap- plied the model equations are not directly integrated in xmin ≤ x(t) ≤ xmax, the NLP formulation (2) but solved to obtain the state umin ≤ u(t) ≤ umax, variables x p,i+1 by a collocation scheme. For detailed description of the NOCP and its transformation to an with t ∈ [t ,t ] - time, t ,t - initial, final time, x(t) ∈ 0 f 0 f NLP refer to (Tamimi and Li, 2010, 2009; Lazutkin et al., Rnx - state variable vector, u(t) ∈ Rnu - control vari- 2014). able vector, x0 - initial state vector, x f - final state vec- tor, f ∈ Rnx+nu → Rnx - implicit differential equation, J - performance index with Mayer and Lagrange term M : Rnx+1 → R and L : Rnx+nu → R, resp., belonging 3 Solution method to corresponding function spaces, g - additional equal- 3.1 Solution of the resulting NLP problem ity and/or inequality constraints, xmin,xmax,umin,umax,- componentwise lower and upper bounds for states x(t) For simplicity reasons the NLP problem (2) is rewritten and controls u(t), resp. in the compact form Envisaging the application of the modified combined multiple shooting and collocation (MCMSC) method, a min{J(ω)} transformation of the infinite-dimensional NOCP (1) to a ω finite-dimensional nonlinear programming (NLP) prob- s. t. E(ω)= 0 , (3) lem is needed. Due to the multiple-shooting technique a S(ω) ≤ 0 , division of the whole time horizon [t0,t f ] into N time in- tervals, so called shooting intervals has to be performed. where ω contains all optimization variables, E all equal- The controls are assumed to be constant in each shooting ity, and S all inequality constraints. interval and are parametrized, i.e. the control vector is The NLP (4) can be solved using an interior-point T composed as V =[v0 v1 ... vN−1] , nu = nv. The states optimization solver (e. g. Ipopt (Wächter and Biegler, are also discretized and parametrized at the shooting in- p 2006)). Hence, the barrier function formulation of the terval boundaries, i.e. the vector X =[x p,0 x p,1 ... x p,N] NLP reads is constructed and equality constraints for continuity rea- sons are taken into account. All other restrictions are nS correspondingly discretized. min J(ω) − µ ∑ ln(z j) ω,z This leads to the NLP notation ( j=1 ) s. t. E(ω)= 0, (4) N−1 ti+1 S(ω)+ z = 0, min M (x p,N)+ ∑ L(x(t),vi)dt X p,V ( i=0 Zti ) with a slack variable z and the corresponding Lagrange s. t. x = x (t ;x ,v ), i = 0,..., N − 1, p,i+1 i+1 p,i i function x p,0 = x0, g¯(X p,V ) ≤ 0, (2) nS L ω,z,λ J ω z λ E T E ω p (ω λ ) = (ω) − µ ∑ ln( j)+(λ ) (ω) x¯min ≤ X ≤ x¯max j=1 u¯ min ≤ V ≤ u¯ max . +(λ S)T (S(ω)+ z) (5)

312 Proceedings of the 11th International Modelica Conference DOI September 21-23, 2015, Versailles, France 10.3384/ecp15118311 Session 4A: Optimization Applications and Methods for a fixed value of the barrier parameter µ. The vector The of the collocation polynomial reads T E S λ λ T , λ T RnE +nS λ = [(λ ) (λ ) ] ∈ contains multipliers asso- nc dx(t) dl j(t) c ciated with the nE + nS equality constraints. The iteration = ∑ x (10) dt dt j scheme of the interior-point algorithm is j=1   b nc t −tk ω l+1 = ω l + αl · ∆ω l, zl+1 = ω l + αl · ∆zl, with l j(t) = ∏ k=1,k6= j t j −tk λ l+1 = λ l + αl · ∆λ l, l = 0,1,... (6) The parametrized states can be put into a vector X p,q E S q The different search directions ∆ω l, ∆zl, ∆λ l , ∆λ l are and the controls into a vector V , where q = 1,2,...,N. obtained applying the Newton method to the KKT opti- The components of the vectors X p,q and V q are included mality conditions of problem (4). Let Z = diag(z) and in the decision variables X p and V , respectively, in the eT =(1,...,1) ∈ RnS . Thus, the following system needs NLP formulation. Furthermore, the vector X c,q repre- to be solved at each iteration step sents all collocation coefficients in the shooting interval q. Hence, the discretized nonlinear differential equation ∆ω l ∇ω L (ω l,zl,λ l) system results in the nonlinear algebraic equation system S ∆zl T −1 −µe Zl +λ l q c,q p,q K ·  E  = −  , where (7) G = W˙ · X +W˙ 0 · X ∆λ l E(ω l) ∆λ S   S(ω )+ z  (t f −t0) c,q q  l   l l  − · F(X ,V ) = 0 (11)     N ∇ωωωω L (ω l,zl,λ l) 0 ∇E(ω l) ∇S(ω l) with q = 1,2,...,N, q - index of shooting interval, N - −2 F f W˙ 0 −µZl 0 InS number of shooting intervals, F - discretized f , W and K =  T  ˙ ∇E(ω l) 0 0 0 W 0 - derivative matrices of the Lagrange polynomials. T  ∇S(ω l) In 0 0  At each iteration step of the optimization procedure,  S  p,q q   for given values X and V , (11) is solved by a New- According to (7) the gradient ∇J(ωl), Jacobians ton method. The results will be X c,q as well as the first- 2 ∇E(ω l), ∇S(ω l), and Hessian matrices ∇ J(ω l), and second-order sensitivities These results are utilized 2 2 ∇ E i(ω l), i = 1,...,nE , ∇ S j(ω l), j = 1,...,nS are re- in the SQP solver for calculation of the functions E, S, quired and made available either analytically or approxi- the Jacobians ∇E, ∇S, and the Hessian H. mately to the optimization solver. In the subsequent dis- Neglecting the shooting interval index q and writing cussions the expression (11) in a compressed form delivers

c p p H = ∇ωωωω L (ω l,zl,λ l) G(X (X ,V ),X ,V )= 0 . (12) nE nS 2 E 2 S 2 To obtain the first-order sensitivities, Eq. (12) has im- = ∇ J(ω)+ ∑λ i ∇ E i(ω)+ ∑ λ j ∇ S j(ω) (8) i=1 j=1 plicitely to be differentiated and provides G X c G will be referred as the analytic Hessian (AH). ∂G ∂X ∂G c T − T = 0 . (13) ∂X ∂ X pT V T ∂ X pT V T 3.2 First- and second-order sensitivities Eq. (13) represents a linear equation system. Typically, ∂G ∂G c and T have sparsity structures that can be A collocation method is used in each shooting interval ∂X ∂[X pT V T ] within the framework of the MCMSC approach. The ∂X c exploited in determination of T . states are approximated inside each shooting interval by ∂[X pT V T ] a linear combination of the Lagrange polynomials (9) us- Using (13), analytic expressions are derived in order ing a shifted Legendre collocation scheme, to calculate the second-order sensitivities. Hence, this equation is re-written here in the following compact form nc nc t t c − k c c p p ∂X p x(t)= ∑ ∏ · x j , (9) Φ(X (X ,V ),X ,V , (X ,V )) = 0 . (14) t −t pT T T j=1 k=1,k6= j j k ! ∂ X V

b c with x(t) - a polynomial approximation of a single state ∂X   The derivative pT T T indicates the dependencies of variable, xc - the unknown collocation-coefficient at the ∂[X V ] j the first-order sensitivities on the decision variables X p j-th collocationb point, {t1,...,tnc } - collocation points in ∂ and V . Applying the differentiation operator T the shooting interval [tq,tq+1]. A shifted scheme means, ∂[X pT V T ] that the last collocation point tnc is shifted to the right to (14) the equations for second-order sensitivities will interval border – a necessity for continuity reasons – and be available in matrix form and can be computed using the other ones are also shifted accordingly. LU decomposition.

DOI Proceedings of the 11th International Modelica Conference 313 10.3384/ecp15118311 September 21-23, 2015, Versailles, France A Toolchain for Solving Dynamic Optimization Problems Using Symbolic and Parallel Computing

4 Toolchain Modelica Objective

Libraries Model As mentioned above in order to offer a comfortable and user-friendly way of modeling and optimizing physical Optimica Constraints systems, to unburden the user from automatable tasks like gradients and sensitivities calculations, to acceler- Optimization model ate the implementation time as well as the computa- JModelica.org tion time for solving the optimization problem, the im- Compiler plementation of the MCMSC method within an open- source toolchain is proposed. This toolchain consists of Symbolic Optimization model amongst others physically-based object-oriented model- CasADi ing, using the modeling language Modelica with the ex- Objective Transformed symbolic tension Optimica, and the large scale nonlinear optimiza- Constraints optimization model tion solver Ipopt.

Evaluation Ipopt Jacobians Sensitivities 4.1 Components analytic Gradients numeric Hessian Ca sADi Compared with block-oriented modeling, the object- Multi-processing Ca sADi BFGS oriented modeling approach provides a more comfort- Newton method able and flexible alternative for physical systems. As a Collocation coefficients result this work uses the modeling language Modelica Optimization (Fritzson, 2014). Not only for simulation purposes but results also for the formulation of different optimization tasks the platform JModelica.org is utilized. Besides the stan- dard Modelica features JModelica.org contains the ex- Figure 2. Parallelized MCMSC toolchain tension Optimica (Åkesson, 2008) including the possi- bility to formulate optimal control problems and other optimization tasks (Åkesson et al., 2010). ery iteration only the numerical values of the variables One essential tool utilized in many respects is the au- mattering have to be updated. It is also possible that nu- tomatic differentiation tool CasADi (Andersson et al., merical Hessians approximated through a BFGS formula 2011, 2012a,b). Hence, it is applied for the calculation which usually incurs more computational effort. of first- and second-order derivatives, symbolic manipu- The MCMSC framework consists of three main parts, lations of the objective and the constraints. i.e. the optimizer, the calculation of state trajectories by The standard multi-processing Python module (Hell- means of a Newton method, and the sensitivity computa- mann, 2011) is a next module within the toolchain to tions. Ideally, both the Newton method and the sensitiv- perform parallel computations. ity calculation are recommended to be executed in par- After adopting the optimization model to the MCMSC allel in each shooting interval. Depending on the com- framework, Ipopt (Wächter and Biegler, 2006) is respon- puter architecture (multi-core, etc.) the user can define sible for the solution of large scale nonlinear optimiza- the number of processes. On the one hand, one gains tion problems. The interoperation of the toolchain within computing time improvements via parallelization. On the optimizer (see Fig. 1) is illustrated in Fig. 2. the other hand, the communication effort increases, the more parallel threads occur. There is a maximum speed up depending on the size of the optimization problem and 4.2 Functionality the computer architecture. After establishing an optimization model and imple- Concerning the first-order sensitivities computations menting it by means of Modelica and the extension Opti- from (13), an LU decomposition and the direct solver for mica, the JModelica.org compiler transforms the model sparse matrices CSparse (Davis, 2006) are applied and into a symbolic one. From the transformed model, i.e., interfaced to CasADi. The second-order derivatives, if model equations, variables, etc. are accessible by the used, are transformed to a linear equation system and Python scripting language. also solved by CasADi. This symbolic tool is further- The proposed MCMSC approach belongs to the cate- more responsible for the generation of the Jacobians, the gory of quasi-sequential methods, i.e. the interior-point gradients of the objective function, and the symbolic ma- optimizer Ipopt solves in every iteration the state equa- nipulations of the objective functions and the constraints. tions (11) and calculates the sensitivities (13) for given The entire approach of the parallelized MCMSC parametrized states and controls of each shooting inter- method is realized in the Python scripting language using val. If the advantageous feature of an analytical Hessian standard multi-processing module without any additional symbolically calculated by CasADi is used, also in ev- software packages.

314 Proceedings of the 11th International Modelica Conference DOI September 21-23, 2015, Versailles, France 10.3384/ecp15118311 Session 4A: Optimization Applications and Methods

Table 1. Classes and functions

(C)lass/(F)unction name Functionality loading(object) auxiliary functions to prepare the optimization problem for Ipopt initialization loads options and uses JModelica.org to extract the optimization problem from Modelica/Optimica. extract prepares the optimization problem define_collocation constructs Lagrange polynomials and derivative matrices construct_vars creates vectors of discretization discretize implicit DAE discretization create_solvers creates for each interval the Newton solver and solvers for first-order derivatives interval_simulation simulates the DAE system for given parameters constr_vec constraints vector for Ipopt jacobian_of_constraint_vector Jacobian for Ipopt create_cost objective for Ipopt prep_ipopt required information for Ipopt, e.g. number of non-zeros in Jacobians, boundaries prep_sys_for_multiproc divide the system according to the number of cores optimize solve the optimization problem by Ipopt exact_hessian construct exact Hessian and corresponding linear solvers scaling scale the problem

4.3 Source code examples DER = OP.getVariables(OP.DERIVATIVE) # Controls Different classes and functions shown in Tab. 1 are INPUT = OP.getVariables(OP.REAL_INPUT) realized dedicated to certain purposes. In particu- # Parameters lar, there are several important issues in the MCMSC P_I = OP.getVariables( ... toolchain. A brief description is given below with OP.REAL_PARAMETER_INDEPENDENT) some source code fragments. Using JModelica.org the P_D = OP.getVariables( ... formulated optimization problem can be easily trans- OP.REAL_PARAMETER_DEPENDENT) ferred for further manipulations using the function transfer_optimization_problem , which has In order to extract model dynamics, the OP variable has name two attributes: is an optimization problem file and specific function to get DAEs, which returns a residual file_path OP is a path to this file. Let be the sym- between left and right hand sides of the equations. bolic representation of the dynamic optimization prob- lem, which includes all required information. To see the DAE = OP.getDaeResidual() list of functions and methods for the OP variable, users have to refer to the JModelica.org source code files. For further manipulations with the extracted DAE, the The proposed toolchain requires a lot of functions symbolic function using CasADi has to be established. from CasADi. For simplification, functionality of To achieve this goal, all variables should be aggregated this software will be made completely available in the into an input vector. toolchain by importing all CasADi classes. The first essential aspect is to get information about MX_DAE = MXFunction(LIST_DER + ... the declared variables (differential and algebraic states, LIST_DIFF + LIST_ALG + ... controls, parameters) in the optimization problem for- LIST_INPUT + P_I + P_D, [DAE]) mulation. The following has been implemented by the MX_DAE.init() developer: Since JModelica.org works with the MX data type and the proposed toolchain accepts currently the SX data # Differential states type, the MXFunction can be converted to SXFunction: DIFF = OP.getVariables( ... OP.DIFFERENTIATED) SX_DAE = SXFunction(MX_DAE) # Algebraic states SX_DAE.init() ALG = OP.getVariables( ... OP.REAL_ALGEBRAIC) For further information about data types in CasADi # Derivatives refer to the manual.

DOI Proceedings of the 11th International Modelica Conference 315 10.3384/ecp15118311 September 21-23, 2015, Versailles, France A Toolchain for Solving Dynamic Optimization Problems Using Symbolic and Parallel Computing

The function SX_DAE should be called with two argu- "csparse") ments, the SX symbolic expression (converted from MX) solver.setOption("abstol",1e-12) and the scaled DAE system (with respect to the interval solver.init() length). # LU solver # NCP - number of collocation points SX_DAE_NUM = SX_DAE.call( ... SYSTEM_INDEX = (N_D + N_A)*NCP SX_INPUTS_DAE)[0] # Full Jacobian SX_DAE_FUNCTION = SXFunction( ... partial_jacobian=interval_dae.jac() [vertcat(SX_INPUTS_DAE)], ... dGdX_sym = partial_jacobian[ ... [LENGTH*SX_DAE_NUM]) range(SYSTEM_INDEX), ... SX_DAE_FUNCTION.init() range(SYSTEM_INDEX)] LHS = MX.sym(’LHS’, ... This function SX_DAE_FUNCTION is involved in the dGdX_sym.sparsity()) discretization procedure, since it declares the variable or- dGdXp_dU_sym = partial_jacobian[ ... der and accepts symbolical evaluation. range(SYSTEM_INDEX), ... For the discretization of the model equations, addi- SYSTEM_INDEX:SYSTEM_INDEX+N_D+N_C] tional variables should be introduced, RHS = MX.sym(’RHS’, ... dGdXp_dU_sym.sparsity()) # Piecewise control LUSolver = solve(LHS,RHS,"csparse") CTRL = SX.sym("c",N_INT*N_C) FRHS = MXFunction( ... # Parameterized states [LHS,RHS],[LUSolver]) P_S_P = SX.sym("ps",(N_INT+1)*(N_D)) FRHS.init() # Collocated differential # and algebraic states The optimization constraints vector and the corre- S_P = SX.sym("s", ... sponding Jacobian matrix, objective function and its gra- N_INT*((N_D + N_A)*NCP)) dient are also constructed by means of CasADi using symbolic manipulation. For the sake of brevity, these where N_INT is the number of shooting intervals de- details are not discussed here. fined by the user, N_D, N_A, and N_C are the numbers Corresponding to the number of the user-defined pro- of differential, algebraic, and control variables. cesses in the case of a multi-core CPU, shooting intervals For each shooting interval, certain variables are can be equally distributed between them. chosen from vectors CTRL, P_S_P, and S_P. The After problem initialization, the Ipopt instance is cre- SX_DAE_FUNCTION is called with these variables and ated to solve the optimization problem: the evaluation results are placed into RES variable. This procedure should be called for each interval. import pyipopt nlp = pyipopt.create(args) RES = SX_DAE_FUNCTION.call( ... x_opt = nlp.solve(initialGuess) [vertcat([der, diff, alg, ctrl, ... p_i_v, p_d_v])])[0] 5 Examples As mentioned before, the state trajectories and sensi- tivities have to be calculated for each interval. For this Showing the efficiency of the proposed approach a small- purpose, the toolchain uses Newton and LU solvers from scale and a large-scale problem are presented. All com- CasADi. putations are performed on a stand-alone personal com- puter with Intel R I7 4.4 GHz, 6 cores, 16 GB RAM with # Newton solver Ubuntu 14.04.1 Server x64 operational system. # interval_dae - discretized DAE # system for one interval 5.1 Satellite control problem # variables - states at collocation # points This nonlinear optimal control problem is devoted to the # parameters - parametrized states calculation of the optimal control and the concluding op- # and controls timal torques according to a Bolza functional that bring F = SXFunction( ... the satellite to rest after an initial tumbling motion. The [vertcat([variables]), ... problem is listed e.g. in (Rudquist and Edvall, 2009). vertcat([parameters])], ... The optimal states are shown in Figs. 3 and 4, where [vertcat([interval_dae])]) the state x8 represents the integrand of the Lagrange solver = ImplicitFunction("newton",F) term. The optimal controls are contained in Fig. 5. The solver.setOption("linear_solver", ... time horizon corresponds to 100 seconds and is divided

316 Proceedings of the 11th International Modelica Conference DOI September 21-23, 2015, Versailles, France 10.3384/ecp15118311 Session 4A: Optimization Applications and Methods

0.5 0.25

0.4 0.20 0.025 0.3 0.15 0.2 0.020 x2 x1 0.10 0.1 0.015 0.05 0.0 u1 0.010 0.1 0.00 0 10 20 30 40 50 60 0 10 20 30 40 50 60 0.005 0.10 1.00 0.98 0.000 0.08 0 10 20 30 40 50 60 0.96 0.00030 0.030 0.06 0.94 0.92 0.00025 0.025 0.04 x3 x4 0.90 0.00020 0.020 0.02 0.88 0.86 0.00015 0.015 u2 0.00 u3 0.84 0.00010 0.010 0.02 0.82 0 10 20 30 40 50 60 0 10 20 30 40 50 60 0.00005 0.005

0.00000 0.000 0 10 20 30 40 50 60 0 10 20 30 40 50 60 Figure 3. Optimal states x1(t) to x4(t)

Figure 5. Optimal controls u (t) to u (t) 0.0105 0.00505 1 3

0.0104 0.00500 0.00495 0.0103 0.00490 x5 x6 0.0102 0.00485

0.0101 0.00480 Table 2. Satellite control problem dimensions

0.0100 0.00475 0 10 20 30 40 50 60 0 10 20 30 40 50 60 Number of ... Value 0.0030 0.025 intervals 60 0.020 0.0025 non-zeros in Jacobians of eqs. 5,768 0.015 0.0020 non-zeros in Jacobians of ineqs. 0 x7 x8 0.010 non-zeros in Hessian of Lagrangian 3,960 0.0015 0.005 equality constraints 480

0.0010 0.000 inequality constraints 0 0 10 20 30 40 50 60 0 10 20 30 40 50 60 variables 668 variables incl. Newton variables 2,108 Figure 4. Optimal states x5(t) to x8(t) into 60 intervals. Figs. 3 - 5 show the correct results. The optimal objective value is in every scenario J∗ = 0.4639. Data being constant through all scenarios are listed in Table 3. Speed-up by parallelization and utilization of analytic Tab. 2. In case of the utilization of the analytical Hes- Hessian sians the number of non-zero elements in the Hessian of Hess. n It. tΣ [s] s s s the Lagrangian equals to 3,960. c 1 2 3 1 8 0.515 1.00 1.00 N/A Tab. 3 shows the speed-up in different scenarios. In BFGS this small-scale problem the effect of parallelization is 6 8 0.355 1.33 N/A 1.00 1 5 0.723 1.00 0.71 N/A not substantial, but the number of iterations can be re- AH duced and thus the computation time is less in most 6 5 0.462 1.56 N/A 0.77 cases. 1 5 0.346 1.00 1.49 N/A AHC In a first case, comparing the speed-up by paralleliza- 6 5 0.320 1.08 N/A 1.11 tion within the same computation scheme for the Hessian Hess.: BFGS - approximated Hessian, AH - analytic (column s1), i. e. considering the first and the second row Hessian (updated in every iteration of the optimizer), in the column s1 in BFGS, AH, and AHC case, resp., the AHC - analytic Hessian (calculated once in iteration 0); speed-up factors s1 reach from 1.08 (AHC), 1.33 (BFGS) nc - no. of CPU cores (nc = 1: no parallelization), It. to 1.56 (AH). - no. of iterations, tΣ - total CPU time (mean value of In the second case, comparisons are dedicated to the 100 runs), si, i = 1,2,3 - speed-up factors, N/A - not non-parallelized versions (column s2) contrasting the applicable Hessian calculation methods to each other. Unsurpris- ingly, no speed-up is achieved (AH/nc = 1/s2 = 0.71) due to the fact that the symbolic calculations take time to es-

DOI Proceedings of the 11th International Modelica Conference 317 10.3384/ecp15118311 September 21-23, 2015, Versailles, France A Toolchain for Solving Dynamic Optimization Problems Using Symbolic and Parallel Computing

tablish the analytic Hessian. But, calculating the Hessian 1.8 only once in the start iteration and leaving it constant 1.6 through the iteration process also delivers the correct 1.4 result with the risk that search direction could be non- 1.2 1.0 descent, and the speed-up amounts to 1.49 (AHC/nc = 0.8 Evaporator pressure 1/s2 = 1.49). 0.6 The third case considers the parallelized versions (col- 0 10 20 30 40 50 60 umn s3), also contrasting the Hessian calculation meth- 1.0 ods to each other. The computation with the BFGS ap- 0.8 proximation constitutes the reference. As to be expected 0.6 Load in this small-scale example, one gets also no speed-up 0.4 in the parallelized versions (AH/nc=4/s3 = 0.77) from 0.2 BFGS approximation to analytic Hessian calculation. 0.0 0 10 20 30 40 50 60 However, confronting BFGS with AHC, the speed-up ac- Time intervals counts to 1.11 (AHC/nc = 4/s3 = 1.11). Figure 6. Optimal pressure and load 5.2 Combined cycle power plant start-up Table 4. CCPP problem dimensions control problem Number of ... Value Another example, a combined cycle power plant (CCPP) was chosen for several reasons. Firstly, this is an exam- intervals 60 ple of interest in the liberalized energy market, because non-zeros in Jacobians of eqs. 7,210 classical power plants have to be adopted to the opera- non-zeros in Jacobians of ineqs. 104,940 tion of electrical energy supply networks with renewable non-zeros in Hessian of Lagrangian 3,960 energies. Thus, they are more often set into operation equality constraints 610 or shutdown than in the past. Secondly, it is a high- inequality constraints 9,540 dimensional problem compared with other academic ex- variables 670 amples. Thirdly, the example is used for the verification variables incl. Newton variables 24,790 of the achieved results. In (Casella and Pretolani, 2006) the system was introduced. The plant is composed of a gas turbine unit, heat recovery steam generator, a steam but using automatic differentiation by ADOL-C. turbine, and a condenser. The start-up time is limited The approach and toolchain discussed in our paper due to the following facts: a maximum load change rate uses a modified combined multiple shooting and col- of the gas turbine, the thermal stress in the thick com- location (MCMSC) method, CasADi for automatic dif- ponents (e.g. steam turbine shafts), and limited control ferentiation, JModelica.org for modeling and formula- variables. tion of the optimization problem by means of Optimica, Several authors used the object-oriented implemented and Ipopt as NLP solver. Thus, a direct comparison to model for the optimal control of the start-up process. In the contributions mentioned above is not possible due to (Casella et al., 2011a) a simplified model is used. The different models, approaches, time horizons, tools, and contribution (Casella et al., 2011b) reports on a solution computers used. Therefore, the only direct comparison using JModelica.org, CppAD for automatic differentia- between MCMSC and collocation method on finite ele- tion, and Ipopt for the NLP solution. The integration ments using JModelica.org is given in Tab. 5. of CasADi and JModelica.org is described in (Anders- Exemplarily, one of the essential optimal states (evap- son et al., 2011), where the CCPP system is also used orator pressure) and the control (normalized load) are as a benchmark system, but the solution is achieved by shown in Fig. 6 above and below, resp., over 60 time a direct collocation approach. An approach with Open- intervals corresponding to 4,000 seconds operation time, Modelica and an optimization language specification, indicating the right behavior. CasADi as the automatic differentiation tool, and dif- Data being constant through all scenarios are listed in ferent optimization methods including direct collocation Tab. 4. Using the analytical Hessians the number of non- and direct multiple shooting, is shown in (Shitahun et al., zero elements in the Hessian of the Lagrangian equals to 2013). A parallel multiple shooting and a collocation op- 3,960. Tab. 5 shows the acceleration of computation time timization, performed with OpenModelica, is explained in most cases. in (Bachmann et al., 2012). That paper discusses multi- Concerning this large-scale problem, the gain ple shooting, multiple collocation, and total collocation achieved by parallelization within the same computation methods using up to 8 cores of a multi-core CPU with method for the Hessian (column s1) is better than in the OpenMP support. In (Ruge et al., 2014) the authors out- small-scale problem above. Comparing the first and the line a toolchain including modeling with OpenModelica, second row in the column s1 in each case (BFGS, AH,

318 Proceedings of the 11th International Modelica Conference DOI September 21-23, 2015, Versailles, France 10.3384/ecp15118311 Session 4A: Optimization Applications and Methods

Table 5. Speed-up by parallelization and utilization of analytic iteration is typically higher if the analytic Hessian is Hessian – comparison with collocation approach used. The approach can most advantageously be applied MCMSC approach to large-scale optimization problems.

Hess. nc It. tO [s] tSN [s] s1 s2 s3 1 47 5.578 8.887 1.00 1.00 N/A BFGS 6 47 5.585 4.509 1.97 N/A 1.00 6 Summary and Conclusions 1 35 2.032 27.612 1.00 0.32 N/A AH An optimal control problem needs to be solved online 6 35 2.038 12.530 2.20 N/A 0.36 in NMPC. This poses a challenge in the implementation 1 34 1.044 7.112 1.00 1.25 N/A AHC of the numerical algorithms and the enhancement of the 6 34 1.066 5.114 1.39 N/A 0.88 computation efficiency for the dynamic optimization ap- Collocation approach proach. In this work, a toolchain for solving nonlinear Hess. nc It. tC [s] sCMP dynamic optimization problems is developed based on BFGS 1 54 11.446 1.13 the combined multiple shooting and collocation method. AH 1 60 7.892 0.54 The toolchain is implemented in open-source software AHC 1 77 6.299 1.02 and both the first and the second-order sensitivities are Hess.: BFGS - approximated Hessian, AH - ana- automatically computed. As a result, the user needs only lytic Hessian (updated in every iteration), AHC - to provide the defined optimal control problem for im- analytic Hessian (calculated once in iteration 0); nc - plementing NMPC. In addition, parallel computing is re- no. of CPU cores, It. - no. of iterations, tO - CPU time alized for performing the computations in the individual for optimization by Ipopt (not parallelizable), tSN - CPU time intervals, thus leading to a reasonable reduction of time for sensitivity calculation and Newton solver, tC the computation time. The results of two case studies - CPU time for pure collocation on finite elements; show the capability of the toolchain for efficiently solv- (all CPU times are averages of 100 runs), si, i = 1,2,3 ing small to large-scale dynamic optimization problems. - speed-up factors, sCMP - speed-up factor between In future, it is planned to offer a web-based optimiza- collocation (CM) and parallelized MCMSC method, tion service for solving nonlinear dynamic optimization N/A - not applicable problems using the proposed approach.

7 Acknowledgments and AHC, resp.), the speed-up factors s1 reach from 1.39 (AHC), 1.97 (BFGS) to 2.20 (AH) referred to tSN. The authors are grateful for the support of the ITEA2/ In the second case, the non-parallelized versions (col- EUREKA Cluster programme by the European Com- umn s2) are under consideration referring the Hessian mission (project no. 11004, Model Driven Physical Sys- calculation methods to each other. Here, a speed-up is tems Operation (MODRIO)) and for the financing by only achieved in the AHC case (AHC/nc = 1/s2 = 1.25). German Ministry of Education and Research (BMBF, The third comparison evaluates the parallelized ver- Förderkennzeichen: 01IS12022H). sions (column s3), again contrasting the Hessian calcu- lation methods to each other. Here, no speed-ups are achieved. Nevertheless, considering the optimization References time tO in the AH and AHC case compared with the BFGS case, it is significantly reduced by factors of 2.74 J. Åkesson. Optimica – an extension of Modelica supporting and 5.24, resp., because of conducive matrix structures. dynamic optimization. In Proc. 6th Int. Modelica Conf., pages 57–66. Modelica Association, March 3-4 2008. To have at least one comparison on the same com- puter of the MCMSC method with collocation method J. Åkesson, K.-E. Årzén, M. Gåfvert, T. Bergdahl, and (CM) on finite elements used in JModelica.org the lower H. Tummescheit. Modeling and optimization with Optimica part in Tab. 5 was added. In the parallelized version and JModelica.org-languages and tools for solving large- of the MCMSC method both the BFGS (BFGS/nc = scale dynamic optimization problems. Comput. Chem. Eng., 6/tO + tSN = 10.094) and the AHC scenario (AHC/nc = 34(11):1737–1749, 2010. 6/tO + tSN = 6.180) are faster than the non-parallelized version of the CM. J. Andersson, J. Åkesson, F. Casella, and M. Diehl. Integration The investigations and presented results show that the of CasADi and JModelica.org. In Proc. 8th Int. Modelica Conf., pages 218–231, 2011. doi:10.3384/ecp11063. presented parallelized MCMSC approach is a power- ful solution technique solving optimal control problems J. Andersson, J. Åkesson, and M. Diehl. Dynamic optimiza- within the proposed toolchain. The number of iterations tion with CasADi. In 51st IEEE Conference on Deci- are reduced compared with both the non-parallelized sion and Control, pages 681–686, 10-13 December 2012a. cases and the collocation approach, but the effort in one doi:10.1109/CDC.2012.6426534.

DOI Proceedings of the 11th International Modelica Conference 319 10.3384/ecp15118311 September 21-23, 2015, Versailles, France A Toolchain for Solving Dynamic Optimization Problems Using Symbolic and Parallel Computing

J. Andersson, J. Åkesson, and M. Diehl. CasADi: A symbolic E. Lazutkin, A. Geletu, S. Hopfgarten, and P. Li. package for automatic differentiation and optimal control. Modified multiple shooting combined with collocation Lect. Notes Comput. Sci. Eng., 87:297–307, 2012b. method in JModelica.org with symbolic calculations. In Proc. 10th Int. Modelica Conf., pages 999–1006, 2014. B. Bachmann, L. Ochel, V. Ruge, M. Gebremedhin, P. Fritz- doi:10.3384/ECP14096999. son, V. Nezhadali, L. Eriksson, and M. Sivertsson. Parallel multiple-shooting and collocation optimization with Open- D. Q. Mayne. Model predictive control: Recent developments Modelica. In Proc. 9th Int. Modelica Conf., pages 659–668, and future promise. Automatica, 50(12):2967–2986, 2014. 2012. P. E. Rudquist and M. M. Edvall. PROPT - Matlab Optimal E. Balsa-Canto, J. R. Banga, A. A. Alonso, and V. S. Vas- Control Software. User’s Guide. TOMLAB Optimization, siliadis. Efficient optimal control of bioprocesses using 2009. second-order information. Ind. Eng. Chem. Res., 39(11): 4287–4295, 2000. doi:10.1021/ie990658p. V. Ruge, W. Braun, B. Bachmann, A. Walther, and K. Kul- shreshtha. Efficient Implementation of Collocation Meth- M. Bartl, P. Li, and L. T. Biegler. Improvement of state profile ods for Optimization using OpenModelica and ADOL-C. accuracy in nonlinear dynamic optimization with the quasi- In Proc. 10th Int. Modelica Conf., pages 1017–1025, 2014. sequential approach. AIChE J., 57(8):2185–2197, 2011. doi:10.3384/ECP140961017. doi:10.1002/aic.12437. A. Schäfer, P. Kühl, M. Diehl, J. Schlöder, and H. G. Bock. T. Barz, R. Klaus, L. Zhu, G. Wozny, and H. Arellano-Garcia. Fast reduced multiple-shooting method for nonlinear model Generation of discrete first- and second-order sensitivities predictive control. Chem. Eng. Process., 46(11):1200– for single shooting. AIChE J., 58(10):3110–3122, 2012. 1214, 2007.

L. T. Biegler, A. M. Cervantes, and A. Wächter. Advances in A. Shitahun, V. Ruge, M. Gebremedhin, B. Bachmann, simultaneous strategies for dynamic process optimization. L. Eriksson, J. Andersson, M. Diehl, and P. Fritzson. Chem. Eng. Sci., 57(4):575–593, 2002. Model-Based Dynamic Optimization with OpenModel- ica and CasADi. In 7th IFAC Symp. on Advances in F. Casella and F. Pretolani. Fast Start-up of a Combined-Cycle Automotive Control, volume 1, pages 446–451, 2013. Power Plant: a Simulation Study with Modelica. In Proc. doi:10.3182/20130904-4-JP-2042.00166. 5th Modelica Conf., pages 3–10, 2006. J. Tamimi and P. Li. Nonlinear model predictive control F. Casella, F. Donida, and J. Åkesson. Object-Oriented Mod- using multiple shooting combined with collocation on fi- eling and Optimal Control: A Case Study in Power Plant nite elements. In 7th IFAC Int. Symp. on Advanced Start-Up. In Prepr. 18th IFAC World Congress. Milano. Control of Chemical Processes, pages 703–708, 2009. Italy, pages 9545–9554, 2011a. doi:10.3182/20090712-4-TR-2008.00114.

F. Casella, M. Farina, F. Righetti, R. Scattolini, D. Faille, J. Tamimi and P. Li. A combined approach to nonlinear model F. Davelaar, A. Tica, H. Gueguen, and D. Dumur. An predictive control of fast systems. J. Process Contr., 20(9): optimization procedure of the start-up of Combined Cycle 1092–1102, 2010. Power Plants. In Prepr. 18th IFAC World Congress. Milano. Italy, pages 7043–7048, 2011b. A. Wächter and L. T. Biegler. On the implementation of a primal-dual interior point filter line search algorithm for T. A. Davis. Direct methods for sparse linear systems. SIAM, large-scale nonlinear programming. Math. Program., 106 2006. (1):25–57, 2006.

P. Fritzson. Principles of object-oriented modeling and simula- Y. Wang and S. Boyd. Fast model predictive control using tion with Modelica 3.3: A cyber-physical approach. Wiley- online optimization. IEEE T. Contr. Syst. T., 18(2):267–278, IEEE Press, 2014. 2010.

D. Hellmann. The Python standard library by example. D. P. Word, J. Kang, J. Åkesson, and C. D. Laird. Efficient Addison-Wesley Professional, 1st edition edition, 2011. parallel solution of large-scale nonlinear dynamic optimiza- tion problems. Comput. Optim. Appl., 59(3):667–688, 2014. W. R. Hong, S. Q. Wang, P. Li, G. Wozny, and L. T. Biegler. doi:10.1007/s10589-014-9651-2. A quasi-sequential approach to large-scale dynamic op- timization problems. AIChE J., 52(1):255–268, 2006. V. M. Zavala, C. D. Laird, and L. T. Biegler. Interior-point de- doi:10.1002/aic.10625. composition approaches for parallel solution of large-scale nonlinear parameter estimation problems. Chem. Eng. Sci., B. Houska, H. J. Ferreau, and M. Diehl. An auto-generated 63(4834-4845):19, 2008. real-time iteration algorithm for nonlinear MPC in the mi- crosecond range. Automatica, 47(10):2279–2285, 2011.

C. Kirches, L. Wirsching, H. G. Bock, and J. P. Schlöder. Effi- cient direct multiple shooting for nonlinear model predictive control on long horizons. J. Process Contr., 22(3):540–551, 2012.

320 Proceedings of the 11th International Modelica Conference DOI September 21-23, 2015, Versailles, France 10.3384/ecp15118311