Neural Comput & Applic DOI 10.1007/s00521-012-1156-2

ORIGINAL ARTICLE

Optimal control problem via neural networks

Sohrab Effati • Morteza Pakdaman

Received: 22 May 2012 / Accepted: 28 August 2012
© Springer-Verlag London Limited 2012

Abstract  This paper proposes a new method, based on the function-approximation capabilities of artificial neural networks, for obtaining the solution of optimal control problems. To do so, we approximate the solution of the Hamiltonian conditions arising from the Pontryagin minimum principle (PMP). For this purpose, we introduce an error function that contains all of the PMP conditions. In the proposed error function, we use trial solutions for the trajectory function, the control function and the Lagrange multipliers. These trial solutions are constructed from neurons, so the error function contains only the weights of the trial solutions, which we then minimize. Substituting the optimal values of the weights into the trial solutions, we obtain the optimal trajectory function, the optimal control function and the optimal Lagrange multipliers.

Keywords  Pontryagin minimum principle · Optimal control problem · Artificial neural networks

S. Effati (✉) · M. Pakdaman
Department of Applied Mathematics, Ferdowsi University of Mashhad, Mashhad, Iran
e-mail: [email protected]

S. Effati
e-mail: [email protected]

S. Effati · M. Pakdaman
Center of Excellence on Soft Computing and Intelligent Information Processing, Ferdowsi University of Mashhad, Mashhad, Iran

1 Introduction

A very important, extensive and widely applicable mathematical model is the optimal control problem. A wide variety of practical problems arising in science and engineering involve a dynamical system whose behaviour must be controlled to attain an objective. In recent years, several researchers have attempted to propose and extend new methods for solving optimal control problems. For example, Krabs et al. [1] proposed a mathematical model for the control of the growth of tumor cells, formulated as an optimal control problem. Modares et al. [2] presented a hybrid algorithm, integrating an improved particle swarm optimization with successive quadratic programming (SQP), for solving nonlinear optimal control problems.

The solutions of optimal control problems can be calculated either by using Pontryagin's minimum principle (PMP), which provides a necessary condition for optimality, or by solving the Hamilton–Jacobi–Bellman (HJB) partial differential equation (PDE), which provides a sufficient condition (see e.g. [3, 4]). Solving the HJB PDE is a very tedious task, and several approximation methods have been proposed for it. Hilscher [5] considered Hamilton–Jacobi theory over time scales and its applications to linear-quadratic problems. Based on the variational iteration method, Berkani et al. [6] proposed a method for solving optimal control problems. Garg et al. [7] presented a unified framework for the numerical solution of optimal control problems using collocation at Legendre–Gauss, Legendre–Gauss–Radau and Legendre–Gauss–Lobatto points. An adaptive multilevel generalized SQP method was presented in [8] to solve PDAE-constrained (partial differential algebraic equation) optimization problems. The notion of KT-invexity from mathematical programming was extended to the classical optimal control problem by the authors of [9]. Optimal control problems subject to mixed control-state constraints were investigated by Gerdts [10], who stated the necessary conditions in terms of a local minimum principle and the use of the Fischer–Burmeister function.

Buldaev [11] used perturbation methods in optimal control problems. Numerical methods based on extended one-step methods were investigated for solving optimal control problems in [12]. Existence results for optimal control problems governed by a variational inequality were given in [13]. Chryssoverghi et al. [14] considered an optimal control problem described by nonlinear ordinary differential equations (ODEs) with control and state constraints, including point-wise state constraints; because their problem may have no classical solutions, they formulated a relaxed form of the problem and used a discretization method. England et al. [15], in an interesting work, expressed optimal control problems as differential algebraic equations. Local stability of the solution to optimal control problems was analyzed by Rodriguez [16]. An approximate-analytical solution of the HJB equation was proposed via the homotopy perturbation method in [17]. Cheng et al., in several works [18–20], proposed neural network solutions for different types of optimal control problems: in [18] a neural network solution for suboptimal control of non-holonomic chained form systems, in [19] a neural network solution for finite-horizon H-infinity constrained optimal control of nonlinear systems, and in [20] fixed-final-time-constrained optimal control laws using neural networks to solve HJB equations for general constrained nonlinear affine systems. Vrabie and Lewis [21] presented a neural network approach to continuous-time direct adaptive optimal control for partially unknown nonlinear systems.

In the last decade, artificial neural networks and other elements of soft computing and artificial intelligence have played an important role in solving hard problems arising in science and engineering. Applying these methods in many contexts has been successful, and the results are comparable with those obtained by classical mathematical algorithms. Lagaris et al. [22] used artificial neural networks to solve ODEs and PDEs for both boundary value problems and initial value problems, and Vrabie and Lewis [21], as mentioned above, used a neural-network-based scheme for continuous-time direct adaptive optimal control of partially unknown nonlinear systems.

In Sect. 2, we introduce the optimal control problem and present some basic concepts of neural network models. Section 3 contains the main idea based on neural network models. In Sect. 4, we apply the new method to some numerical problems, and finally Sect. 5 contains concluding remarks.

2 Preliminaries

In this paper, we consider the following type of optimal control problem:

$$\min \int_{t_0}^{t_f} f_0(x(t), u(t), t)\, dt \quad \text{s.t.} \quad \dot{x} = g(x(t), u(t), t), \quad x(t_0) = x_0, \tag{1}$$

where $x(t) \in \mathbb{R}^n$ is the state variable, $u(t) \in \mathbb{R}^m$ is the control variable and $t \in \mathbb{R}$. It is assumed that the integrand $f_0$ has continuous first and second partial derivatives with respect to all its arguments. We also assume that $t_0$ and $t_f$ are fixed and that $g$ is Lipschitz continuous on a set $\Omega \subseteq \mathbb{R}^n$.

According to problem (1), we can construct the well-known Hamiltonian as $H(x(t), u(t), p(t), t) = f_0(x(t), u(t), t) + p(t)\, g(x(t), u(t), t)$, where $p(t) \in \mathbb{R}^n$ is the costate vector. Suppose that we denote the optimal state, costate and control functions by $x^*(t)$, $p^*(t)$ and $u^*(t)$, respectively. Then a necessary condition for $u^*(t)$ to minimize the objective functional in (1) is that

$$H(x^*(t), u^*(t), p^*(t), t) \le H(x^*(t), u(t), p^*(t), t) \tag{2}$$

for all $t \in [t_0, t_f]$ and for all admissible controls. Equation (2), which states that an optimal control must minimize the Hamiltonian, is called the PMP (see [3]); it provides a necessary condition for optimality. The PMP shows that if $x(t)$, $p(t)$ and $u(t)$ are the optimal state, costate and control, respectively, they must satisfy

$$\frac{\partial H(x, u, p, t)}{\partial x} = -\dot{p}(t), \qquad \frac{\partial H(x, u, p, t)}{\partial p} = \dot{x}(t), \qquad \frac{\partial H(x, u, p, t)}{\partial u} = 0. \tag{3}$$

Substituting the known functions $f_0$ and $g$ into the Hamiltonian, Eq. (3) gives a system of ODEs, which can be solved via numerical methods or other existing methods. In some cases, Eq. (3) yields a straightforward ODE system that can be solved easily, but in most cases (especially in practical problems) the system cannot be solved easily, and an approximation scheme must be applied. In the next section, we apply the function-approximation ability of neural networks to solve (3).

A basic neuron based on a perceptron can be seen in Fig. 1. It is proved that multi-layer perceptrons can approximate any nonlinear function with arbitrary accuracy (see [23]). In Fig. 1, W is the weight vector of the input layer, b is the vector of bias weights, and V contains the output-layer weights.

Fig. 1 Basic perceptron (input weighted by W, summation with bias b, sigmoid activation, output weights V)

The output of this perceptron can be calculated from the following formulation:

$$\text{out} = \sum_{i=1}^{k} v_i\, \sigma(z_i), \qquad z_i = w_i x + b_i, \tag{4}$$

where $k$ is the number of sigmoid units. The activation function used here is the sigmoid function

$$\sigma(x) = \frac{1}{1 + e^{-x}}. \tag{5}$$

Based on the Kolmogorov theorem, it is proved that any continuous function can be implemented with a multi-layer perceptron (for more details, see [23]). According to this theorem, we use the function-approximation ability of neural networks to approximate the state, costate and control functions of the optimal control problem (1), as discussed in detail in the next section.
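As an illustration (ours, not from the paper), the map (4) with the sigmoid (5) can be evaluated, together with its derivative with respect to the input, in a few lines of Python; having this derivative in closed form is what makes the trial-solution approach of the next section convenient, since the time derivatives of the trial functions then become available analytically. The function and weight values below are purely illustrative.

```python
# Sketch of the single-hidden-layer map (4)-(5) and its input derivative.
# Weight values here are arbitrary examples, not taken from the paper.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def perceptron(x, w, b, v):
    """out = sum_i v_i * sigmoid(w_i * x + b_i), cf. Eq. (4)."""
    s = sigmoid(w * x + b)
    out = np.dot(v, s)
    dout_dx = np.dot(v, s * (1.0 - s) * w)  # sigma'(z) = sigma(z) * (1 - sigma(z))
    return out, dout_dx

w = np.array([1.0, -2.0, 0.5])   # input weights (k = 3 sigmoid units)
b = np.array([0.0, 1.0, -1.0])   # bias weights
v = np.array([0.3, -0.7, 1.1])   # output-layer weights
print(perceptron(0.4, w, b, v))
```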
3 Main idea

In this section, we propose an approximation scheme for solving the equations arising from the PMP (i.e., Eq. 3). We consider a separate neural network for each function: the state (its neural network is $n_x$), the costate (its neural network is $n_p$) and the control (its neural network is $n_u$), where each neural network model contains its own adjustable parameters, as in Fig. 1. Note that the structures of the neural network models must be constructed such that they satisfy the initial or boundary conditions. The proposed neural network models have the following forms:

$$n_x = \sum_{i=1}^{I} v_x^i\, \sigma(z_x^i), \quad z_x^i = w_x^i t + b_x^i; \qquad
n_p = \sum_{i=1}^{I} v_p^i\, \sigma(z_p^i), \quad z_p^i = w_p^i t + b_p^i; \qquad
n_u = \sum_{i=1}^{I} v_u^i\, \sigma(z_u^i), \quad z_u^i = w_u^i t + b_u^i, \tag{6}$$

for $i = 1, 2, \ldots, I$, where $I$ is the number of neurons, which can be different for each neural network.

Now we are ready to use the neural networks (6) to define the main trial solutions. The trial solutions (for the state, costate and control functions) contain the neural networks and satisfy the initial or boundary conditions; thus, they can be defined with the following structures:

$$x_T = x_0 + (t - t_0)\, n_x, \qquad p_T = n_p, \qquad u_T = n_u. \tag{7}$$

It is easy to check that $x_T$ satisfies the initial condition ($x_T(t_0) = x_0$). Note that we may have $p(\cdot) = 0$ at free endpoints. For example, if $x(t_0)$ is free, we must have $p(t_0) = 0$, and thus we can define $p_T$ in (7) as $p_T = (t - t_0)\, n_p$. For other initial (or boundary) conditions, we can construct appropriate trial functions.

By replacing the trial solutions in the Hamiltonian function, we can define a trial Hamiltonian $H_T$, which is the conventional Hamiltonian $H$ with the functions $x$, $p$ and $u$ replaced by their corresponding trial forms ($x_T$, $p_T$ and $u_T$, respectively): $H_T(x_T(t), u_T(t), p_T(t), t) = f_0(x_T(t), u_T(t), t) + p_T(t)\, g(x_T(t), u_T(t), t)$. Thus, the trial Hamiltonian contains the weights of the neural networks. Since the trial solutions (7) must satisfy conditions (3), we substitute them into Eq. (3):

$$\frac{\partial H_T}{\partial x_T} + \dot{p}_T = 0, \qquad \frac{\partial H_T}{\partial p_T} - \dot{x}_T = 0, \qquad \frac{\partial H_T}{\partial u_T} = 0. \tag{8}$$

To solve the system (8), we define three error functions, one for each equation:

$$E_1(\phi, t) = \left[\frac{\partial H_T}{\partial x_T} + \dot{p}_T\right]^2, \qquad
E_2(\phi, t) = \left[\frac{\partial H_T}{\partial p_T} - \dot{x}_T\right]^2, \qquad
E_3(\phi, t) = \left[\frac{\partial H_T}{\partial u_T}\right]^2, \tag{9}$$

and finally a total error function $E(\phi, t) = E_1(\phi, t) + E_2(\phi, t) + E_3(\phi, t)$, where $\phi$ is a vector containing all the weights of the three neural networks (6). Indeed, $\phi$ contains all the weights $w_x, w_p, w_u, b_x, b_p, b_u, v_x, v_p$ and $v_u$. Now, instead of solving Eq. (8), we discretize the interval $[t_0, t_f]$ (by $m$ points) and solve the following unconstrained optimization problem:

$$\min_{\phi} \sum_{k=1}^{m} E(t_k, \phi). \tag{10}$$

To solve (10), which is an unconstrained optimization problem, we can use any optimization algorithm, such as steepest descent, Newton or quasi-Newton methods, as well as heuristic algorithms such as the genetic algorithm (GA) or particle swarm optimization (PSO).

After terminating the optimization step, we substitute the optimal values of the weights $\phi$ (containing the weights of the input and output layers and the bias vector) into Eq. (7) and obtain the trial structures of the state, costate and control functions.

The main advantages of this method are that the implementation of the algorithm is not very complicated, and we can use more hidden layers or more training points over the interval $[t_0, t_f]$ to obtain more accurate approximations. Finally, the state, costate and control functions are obtained as functions of time $t$, so we can calculate the solution at every arbitrary point of the interval $[t_0, t_f]$. Also, the proposed control and state functions are differentiable, which can be useful in applications.

4 Numerical simulations

In this section, we implement the proposed algorithm to solve four optimal control problems. For all problems, we used five parameters for each of the input, output and bias weight vectors. The intervals were discretized into ten equal parts. For the optimization step, we used the MATLAB 7 optimization toolbox with the quasi-Newton BFGS algorithm. The user can use other optimization algorithms, such as steepest descent, Newton-based methods, or heuristic algorithms such as GA or particle swarm optimization.
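To make this workflow concrete, the following is a minimal sketch (ours, not the authors' MATLAB implementation) for the linear-quadratic problem of Example 4.1 below: the trial solutions (14) are built from three five-neuron networks (6), the PMP residuals (13) are squared and summed over a uniform grid as in (9)–(10), and the result is minimized with SciPy's BFGS in place of the MATLAB optimization toolbox. The grid size, random initialization and helper names are our assumptions.

```python
# Minimal sketch of the method for Example 4.1 (not the authors' code).
# Trial solutions (14) with I = 5 sigmoid units per network, trained by
# minimizing the discretized PMP residual (10) with quasi-Newton BFGS.
import numpy as np
from scipy.optimize import minimize

I = 5                                  # sigmoid units per network
t = np.linspace(0.0, 1.0, 11)          # training grid on [0, 1]

def net(tt, w, b, v):
    """Value and d/dt of the network (6): sum_i v_i * sigmoid(w_i*t + b_i)."""
    s = 1.0 / (1.0 + np.exp(-(np.outer(tt, w) + b)))   # shape (len(tt), I)
    return s @ v, (s * (1.0 - s) * w) @ v

def unpack(phi):
    # phi stacks (w, b, v) for the x-, p- and u-networks, 3*I numbers each.
    return [phi[k * 3 * I:(k + 1) * 3 * I].reshape(3, I) for k in range(3)]

def total_error(phi):
    (wx, bx, vx), (wp, bp, vp), (wu, bu, vu) = unpack(phi)
    nx, dnx = net(t, wx, bx, vx)
    npv, dnpv = net(t, wp, bp, vp)
    nu, _ = net(t, wu, bu, vu)
    xT, dxT = 1.0 + t * nx, nx + t * dnx                 # trial state (14)
    pT, dpT = (t - 1.0) * npv, npv + (t - 1.0) * dnpv    # enforces p(1) = 0
    uT = nu
    e1 = 2.0 * xT + dpT        # dH/dx + p_dot, cf. (13) and (9)
    e2 = uT - dxT              # dH/dp - x_dot
    e3 = 2.0 * uT + pT         # dH/du
    return np.sum(e1**2 + e2**2 + e3**2)

phi0 = 0.5 * np.random.default_rng(0).standard_normal(9 * I)
res = minimize(total_error, phi0, method="BFGS")

# Compare with the closed-form solution of the linear PMP system (13).
(wx, bx, vx), _, (wu, bu, vu) = unpack(res.x)
tt = np.linspace(0.0, 1.0, 101)
xT, uT = 1.0 + tt * net(tt, wx, bx, vx)[0], net(tt, wu, bu, vu)[0]
x_star = np.cosh(1.0 - tt) / np.cosh(1.0)
u_star = -np.sinh(1.0 - tt) / np.cosh(1.0)
print(res.fun, np.abs(xT - x_star).max(), np.abs(uT - u_star).max())
```

The closed-form comparison at the end uses the solution of the linear system (13), derived after Example 4.1.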

Example 4.1 Consider the following optimization problem:

$$\min \int_0^1 \left[x^2(t) + u^2(t)\right] dt \quad \text{s.t.} \quad \dot{x} = u(t), \quad x(0) = 1, \ x(1) \text{ free}. \tag{11}$$

First we must construct the Hamiltonian function:

$$H(x, u, p, t) = x^2(t) + u^2(t) + p\, u(t). \tag{12}$$

Following Eq. (3), we must have

$$2x(t) = -\dot{p}, \qquad \dot{x} = u(t), \qquad 2u(t) + p = 0. \tag{13}$$

Because $x(1)$ is free, we have $p(1) = 0$. Considering this condition and the initial condition $x(0) = 1$, we can choose the trial solutions as

$$x_T = 1 + t\, n_x, \qquad p_T = (t - 1)\, n_p, \qquad u_T = n_u. \tag{14}$$

For this example, we used 15 weights for each neural network (five weights for each of the input-layer, output-layer and bias vectors). The approximate and exact solutions for $u(t)$ and $x(t)$ can be seen in Figs. 2 and 3, respectively. Figures 4 and 5 show the solution accuracy.

Fig. 2 Exact and approximated control function (Example 4.1)
Fig. 3 Exact and approximated state function (Example 4.1)
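For reference, system (13) is linear and admits a closed form (this derivation is ours; the paper only plots the exact curves). From $2u + p = 0$ and $\dot{x} = u$ we get $p = -2\dot{x}$, and substituting into $\dot{p} = -2x$ gives $\ddot{x} = x$ with $x(0) = 1$ and $\dot{x}(1) = -\tfrac{1}{2}p(1) = 0$, so

$$x^*(t) = \frac{\cosh(1-t)}{\cosh(1)}, \qquad u^*(t) = \dot{x}^*(t) = -\frac{\sinh(1-t)}{\cosh(1)}, \qquad p^*(t) = \frac{2\sinh(1-t)}{\cosh(1)},$$

which is consistent with the ranges plotted in Figs. 2 and 3 ($u^*(0) = -\tanh(1) \approx -0.76$ and $x^*(1) = 1/\cosh(1) \approx 0.65$).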

Example 4.2 Consider the following optimization problem:

$$\min \int_0^1 \left[(2 - x(t))^2 + u^2(t)\right] dt \quad \text{s.t.} \quad \dot{x} = -0.25\sqrt{x(t)} + u(t), \quad x(0) = 0, \ x(1) = 2. \tag{15}$$

First we must construct the Hamiltonian function:

$$H(x, u, p, t) = (2 - x(t))^2 + u^2(t) + p\left(-0.25\sqrt{x(t)} + u(t)\right). \tag{16}$$

Following Eq. (3), we must have

$$-2(2 - x(t)) - \frac{0.25\, p}{2\sqrt{x(t)}} = -\dot{p}, \qquad \dot{x} = -0.25\sqrt{x(t)} + u(t), \qquad 2u(t) + p = 0. \tag{17}$$

Considering the boundary conditions $x(0) = 0$ and $x(1) = 2$, we can choose the trial solutions as

$$x_T = 2t + t(t - 1)\, n_x, \qquad p_T = n_p, \qquad u_T = n_u. \tag{18}$$
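The quadratic blending term $t(t-1)$ in (18) is what pins down both boundary values regardless of the network output; a quick illustrative check (ours, with a stand-in function in place of the trained network):

```python
# Verify that the trial state of (18), x_T(t) = 2t + t(t-1) n_x(t),
# satisfies x(0) = 0 and x(1) = 2 for any choice of n_x.
import numpy as np

def x_trial(t, n_x):
    return 2.0 * t + t * (t - 1.0) * n_x(t)

n_x = lambda t: np.sin(3.0 * t) + 0.5        # stand-in for the trained network
print(x_trial(0.0, n_x), x_trial(1.0, n_x))  # prints 0.0 and 2.0
```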

Fig. 4 Error for estimating control function (Example 4.1)
Fig. 5 Error for estimating state function (Example 4.1)
Fig. 6 Exact and approximated control function (Example 4.2)
Fig. 7 Exact and approximated state function (Example 4.2)

For this example, we used 15 weights for each neural network (five weights for each of the input-layer, output-layer and bias vectors). The approximate and exact solutions can be seen in Figs. 6 and 7, and Figs. 8 and 9 show the solution accuracy. This example is solved in [6] by a variational iteration method. Our results are comparable with the results in [6]; however, the neural network method gives the state and control as differentiable functions of time, and the method is simpler to implement.

Fig. 8 Error for estimating control function (Example 4.2)

Example 4.3 Consider the following optimization problem:

$$\min\ J = -x(2) \quad \text{s.t.} \quad \dot{x} = \frac{5}{2}\left(-x + ux - u^2\right), \quad x(0) = 1. \tag{19}$$

Fig. 9 Error for estimating state function (Example 4.2)
Fig. 11 Error for estimating state function (Example 4.3)

The exact state and control functions are as follows:

$$x(t) = \frac{4}{1 + 3e^{5t/2}}, \qquad u(t) = \frac{x(t)}{2}. \tag{20}$$
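As a consistency check (our derivation; the paper states (20) without proof), applying conditions (3) to (19) with $H = p\cdot\tfrac{5}{2}(-x + ux - u^2)$ gives $\partial H/\partial u = \tfrac{5}{2}p\,(x - 2u) = 0$, hence $u = x/2$ whenever $p \neq 0$ (and the transversality condition $p(2) = \partial(-x(2))/\partial x = -1 < 0$ ensures this stationary point minimizes $H$). Substituting $u = x/2$ into the dynamics yields

$$\dot{x} = \frac{5}{2}\left(-x + \frac{x^2}{4}\right), \qquad x(0) = 1,$$

a separable equation whose solution is exactly $x^*(t) = 4/(1 + 3e^{5t/2})$, in agreement with (20).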

Figures 10 and 11 show the state function approximation and the corresponding error. Figures 12 and 13 show the control function approximation and its corresponding error.

Fig. 12 Exact and approximated control function (Example 4.3)

Example 4.4 Consider the following nonlinear optimal control problem [24, 25]:

$$\min \int_0^1 u^2(t)\, dt \quad \text{s.t.} \quad \dot{x} = 0.5 x^2(t) \sin(x(t)) + u(t), \quad x(0) = 0, \ x(1) = 0.5. \tag{21}$$

Fig. 10 Exact and approximated state function (Example 4.3)
Fig. 13 Error for estimating control function (Example 4.3)


This problem is solved in [25] by a variational method. For this example, we have $H(x(t), u(t), p(t), t) = u^2(t) + p\left(0.5 x^2(t)\sin(x(t)) + u(t)\right)$. Thus, conditions (3) can be derived as the following system:

$$\dot{x}(t) = 0.5 x^2(t)\sin(x(t)) - 0.5 p(t), \qquad
\dot{p}(t) = -p(t)\, x(t)\sin(x(t)) - 0.5\, p(t)\, x^2(t)\cos(x(t)), \qquad
2u(t) + p(t) = 0, \tag{22}$$

with the initial and final conditions $x(0) = 0$ and $x(1) = 0.5$. This system was also solved by a numerical method (the Euler method), and the results are displayed and compared with the results obtained by the neural network in Figs. 14 and 15.

Fig. 14 Control function (Example 4.4)
Fig. 15 State function (Example 4.4)
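The paper does not spell out this numerical baseline beyond naming the Euler method, so the following is one plausible reading (a sketch under our own assumptions, not the authors' code): integrate the boundary value problem (22) with explicit Euler and a shooting iteration on the unknown $p(0)$, then recover $u = -p/2$ and the cost of (21). The step count and shooting bracket are guesses.

```python
# Sketch of an Euler/shooting baseline for the two-point BVP (22).
# The bracket for p(0), the step count and all names are our assumptions.
import numpy as np
from scipy.optimize import brentq

N, t0, tf = 1000, 0.0, 1.0
h = (tf - t0) / N

def integrate(p0):
    """Explicit Euler for (22) starting from x(0) = 0, p(0) = p0."""
    x, p = 0.0, p0
    xs, ps = [x], [p]
    for _ in range(N):
        dx = 0.5 * x**2 * np.sin(x) - 0.5 * p
        dp = -p * x * np.sin(x) - 0.5 * p * x**2 * np.cos(x)
        x, p = x + h * dx, p + h * dp
        xs.append(x)
        ps.append(p)
    return np.array(xs), np.array(ps)

# Shoot on p(0) so that the terminal condition x(1) = 0.5 is met.
p0 = brentq(lambda g: integrate(g)[0][-1] - 0.5, -2.0, 0.0)
xs, ps = integrate(p0)
u = -0.5 * ps                                   # from 2u + p = 0
J = np.sum(0.5 * (u[1:]**2 + u[:-1]**2)) * h    # trapezoidal cost of (21)
print(p0, xs[-1], J)
```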

The optimal value of the objective functional obtained by the neural network method is $J^* = 0.2353$, and $x(t_f)$ is exactly equal to 0.5. Comparing our result with the results obtained in [25] shows the accuracy of the method based on neural networks.

5 Concluding remarks

This paper presented an approximate solution of optimal control problems based on the neural network approach. One advantage of the proposed method is that, to attain more accurate solutions, we can use more hidden layers and more training points, and we may also use heuristic algorithms such as GA and PSO, or other existing unconstrained optimization algorithms, in the optimization step. The proposed solution is a differentiable function (for the state, costate and control functions). Work is in progress to apply the method to approximate the solution of the HJB equation and also to problems arising in the calculus of variations.

References

1. Krabs W, Pickl S (2010) An optimal control problem in cancer chemotherapy. Appl Math Comput 217:1117–1124
2. Modares H, Naghibi Sistani MB (2011) Solving nonlinear optimal control problems using a hybrid IPSO–SQP algorithm. Eng Appl Artif Intell 24:476–484
3. Kirk DE (2004) Optimal control theory—an introduction. Dover Publications, Mineola, NY
4. Lewis F, Syrmos VL (1995) Optimal control. Wiley, New York
5. Hilscher RS, Zeidan V (2012) Hamilton–Jacobi theory over time scales and applications to linear-quadratic problems. Nonlinear Anal 75:932–950
6. Berkani S, Manseur F, Maidi A (2012) Optimal control based on the variational iteration method. Comput Math Appl 64:604–610
7. Garg D, Patterson M, Hager WW, Rao AV, Benson DA, Huntington GT (2010) A unified framework for the numerical solution of optimal control problems using pseudo-spectral methods. Automatica 46:1843–1851
8. Clever D, Lang J, Ulbrich S, Ziems JC (2010) Combination of an adaptive multilevel SQP method and a space-time adaptive PDAE solver for optimal control problems. Procedia Comput Sci 1:1435–1443
9. de Oliveira VA, Silva GN, Rojas-Medar MA (2009) KT-invexity in optimal control problems. Nonlinear Anal 71:4790–4797
10. Gerdts M (2008) A non-smooth Newton's method for control-state constrained optimal control problems. Math Comput Simul 79:925–936
11. Buldaev AS (2008) Perturbation methods in optimal control problems. Ecol Model 216:157–159
12. Salama AA (2006) Numerical methods based on extended one-step methods for solving optimal control problems. Appl Math Comput 183:243–250
13. Zhou YY, Yang XQ, Teo KL (2006) The existence results for optimal control problems governed by a variational inequality. J Math Anal Appl 321:595–608
14. Chryssoverghi I, Coletsos I, Kokkinis B (2006) Discretization methods for optimal control problems with state constraints. J Comput Appl Math 191:1–31
15. England R, Gomez S, Lamour R (2005) Expressing optimal control problems as differential algebraic equations. Comput Chem Eng 29:1720–1730
16. Rodriguez A (2004) On the local stability of the solution to optimal control problems. J Econ Dyn Control 28:2475–2484
17. Saberi Nik H, Effati S, Shirazian M (2012) An approximate-analytical solution for the Hamilton–Jacobi–Bellman equation via homotopy perturbation method. Appl Math Model 36(11):5614–5623
18. Cheng T, Sun H, Qu Z, Lewis FL (2009) Neural network solution for suboptimal control of non-holonomic chained form system. Trans Inst Meas Control 31(6):475–494
19. Cheng T, Lewis FL (2007) Neural network solution for finite-horizon H-infinity constrained optimal control of nonlinear systems. J Control Theory Appl 5(1):1–11
20. Cheng T, Lewis FL, Abu-Khalaf M (2007) Fixed-final-time-constrained optimal control of nonlinear systems using neural network HJB approach. IEEE Trans Neural Netw 18(6):1725–1737
21. Vrabie D, Lewis FL (2009) Neural network approach to continuous-time direct adaptive optimal control for partially unknown nonlinear systems. Neural Netw 22:237–246
22. Lagaris IE, Likas A, Fotiadis DI (1998) Artificial neural networks for solving ordinary and partial differential equations. IEEE Trans Neural Netw 9(5):987–1000
23. Kecman V (2001) Learning and soft computing. MIT Press, Cambridge, MA
24. Rubio JE (1986) Control and optimization: the linear treatment of nonlinear problems. Manchester University Press, Manchester
25. Shirazian M, Effati S (2012) Solving a class of nonlinear optimal control problems via He's variational iteration method. Int J Control Autom Syst 10(2):249–256
