Stochastic Model Predictive Control with Driver Behavior Learning for Improved Powertrain Control
49th IEEE Conference on Decision and Control, December 15-17, 2010, Hilton Atlanta Hotel, Atlanta, GA, USA

M. Bichi, G. Ripaccioli, S. Di Cairano, D. Bernardini, A. Bemporad, I.V. Kolmanovsky

(M. Bichi, G. Ripaccioli, and D. Bernardini are with the Dept. of Information Engineering, University of Siena, Italy. S. Di Cairano is with Powertrain Control R&A, Ford Research and Adv. Engineering. A. Bemporad is with the Dept. of Mechanical and Structural Engineering, University of Trento, Italy. I.V. Kolmanovsky is with the Dept. of Aerospace Engineering, University of Michigan, Ann Arbor, MI.)

Abstract— In this paper we advocate the use of stochastic model predictive control (SMPC) for improving the performance of powertrain control algorithms, by optimally controlling the complex system composed of driver and vehicle. While the powertrain is modeled as the deterministic component of the dynamics, the driver behavior is represented as a stochastic system which affects the vehicle dynamics. Since stochastic MPC is based on online numerical optimization, the driver model can be learned online, hence allowing the control algorithm to adapt to different drivers and drivers' behaviors. The proposed technique is evaluated in two applications: adaptive cruise control, where the driver behavioral model is used to predict the leading vehicle dynamics, and series hybrid electric vehicle (SHEV) energy management, where the driver model is used to predict the future power requests.

I. INTRODUCTION

Modern automotive vehicles are complex systems in which the mechanical components interact with the control electronics and with the human driver. With the constant increase in powertrain complexity, the tightening of emissions standards, rising fuel prices, and the growing number of vehicle functionalities, advanced control algorithms are needed to meet the specification requirements while keeping sensor and actuator costs limited.

Model predictive control (MPC) [1] is an appealing candidate for the control of complex automotive systems, due to its capability of coordinating multiple constrained actuators and optimizing the system behavior with respect to a performance objective. However, MPC requires a model of the dynamics and, in general, the more precisely the model represents the real dynamics, the better the closed-loop performance is. While suitable models are available for the dynamics of most vehicle components [2], the overall vehicle behavior strongly depends on what the driver is doing. As such, including a prediction model of possible future driver actions may increase the closed-loop performance of MPC.

In this paper we propose to model the driver as a stochastic process whose output affects a deterministic model of the vehicle. Thus, the obtained model of the vehicle and driver dynamics is a stochastic dynamical system that requires appropriate stochastic control algorithms. Stochastic control algorithms have already been proposed in automotive applications to tackle the uncertainty arising from the vehicle environment. Concrete examples are stochastic dynamic programming applied to hybrid electric vehicle (HEV) energy management [3], [4] and emission control [5], and linear stochastic optimal control applied to chassis control [6]. However, stochastic linear control is often inadequate to capture the variety of driver behaviors, while stochastic dynamic programming is numerically intensive and cannot be easily updated if the underlying statistical model changes. In recent years various stochastic model predictive control (SMPC) algorithms have been proposed, based on different prediction models and stochastic optimal control problems; see [7]–[9] and the references therein.

In this paper, we model the deterministic dynamics by a linear system and the stochastic driver model by a Markov chain. The choice of Markov chains to represent driver behaviors is motivated by previous literature [3], [5], and by the approximation properties of Markov chains [10]. The overall model is used in a finite-horizon stochastic optimal control problem, where the expected performance objective is optimized subject to constraints on states and inputs. Concurrently, the Markov chain modeling the driver is updated online by a simple learning algorithm based on linear filtering of the transition frequencies. This allows the SMPC controller to adapt to changes in the driver behavior.

The paper is structured as follows. In Section II we discuss the driver model based on Markov chains and the corresponding online learning algorithm, and in Section III we introduce the stochastic model predictive control algorithm used in this paper. In Sections IV and V we present the applications of the proposed approach to adaptive cruise control (ACC) and to energy management in a series hybrid electric vehicle (SHEV), respectively. The conclusions are summarized in Section VI.

Notation: R, Z, Z0+ denote the sets of real, integer, and nonnegative integer numbers, respectively. For a set A, |A| denotes its cardinality. For a vector a, [a]i is the i-th component; for a matrix A, [A]ij is the ij-th element and [A]i is the i-th row. We denote a square matrix of size s × s entirely composed of zeros by 0s, and the identity by Is. Subscripts are dropped when clear from the context.

II. STOCHASTIC DRIVER MODEL AND LEARNING ALGORITHM

We model the actions of the driver on the vehicle by a stochastic process w(·), where w(k) ∈ W for all k ∈ Z0+. We assume that at time k the value w(k) can be measured and, with a slight abuse of notation, we denote by w(k) the measured realization of the disturbance at k ∈ Z0+. Depending on the specific application, the vector w(k) may represent different quantities, such as the power request in an HEV, the acceleration, the velocity, the angular rate applied to the steering wheel, or any combination of the above.

For prediction purposes, the random process generating w is modelled as a Markov chain with states W = {w1, w2, ..., ws}, where wi ∈ W for all i ∈ {1, ..., s}. The cardinality |W| defines the trade-off between stochastic model complexity and precision. The Markov chain is defined by a transition probability matrix T ∈ R^(s×s) such that

  [T]ij = Pr[w(k+1) = wj | w(k) = wi],  (1)

for all i, j ∈ {1, ..., s}, where w(k) is the state of the Markov chain at time k. By using the Markov chain model, given w(k) = wi, the probability distribution of w(k+ℓ) is computed as

  Pr[w(k+ℓ) = wj | w(k) = wi] = [(T^ℓ)' ǫi]j,  (2)

where ǫi is the i-th unitary vector, i.e., [ǫi]i = 1 and [ǫi]j = 0 for all j ≠ i.

A straightforward extension of this model is to represent the stochastic process w(·) by a set of Markov chains Tm, m = 1, ..., µ, where, in automotive applications, the currently active Markov chain is chosen at every instant depending on current conditions such as velocity, temperature, or road surface.

The Markov chain transition matrix T can be updated online by different learning algorithms of varying complexity [10], [11]. From a batch of measurements {w(k)}, k = 0, ..., L, the transition matrix T is estimated by

  [T]ij = nij / ni,  i, j ∈ {1, ..., s},  (3)

where nij = |Kij|,

  Kij = {k | argmin_{h∈{1,...,s}} |w(k) − wh| = i, argmin_{h∈{1,...,s}} |w(k+1) − wh| = j},

and ni = Σ_{j=1}^{s} nij for all i ∈ {1, ..., s}. When new values of w(·) are collected, the Markov chain is updated. The learning procedure used here is detailed in Algorithm 1, where N ∈ Z0+^(s×s) is used to store the measured data.

Algorithm 1 On-line driver's model learning procedure
 1: given T, at k = 0 set:
 2:   N = 0s, τ = 0;
 3:   i = argmin_{h∈{1,...,s}} |w(0) − wh|;
 4: for all k ≥ 1 do
 5:   τ = τ + 1;
 6:   j = argmin_{h∈{1,...,s}} |w(k) − wh|;
 7:   [N]ij = [N]ij + 1;
 8:   if τ = τmax then
 9:     for all h ∈ {1, 2, ..., s} do
10:       [T]h = ([N]h + λ[T]h)(λ + Σ_{l=1}^{s} [N]hl)^(-1);
11:     end for
12:     N = 0s, τ = 0;
13:   end if
14:   i = j;
15: end for

Algorithm 1 is a linear filtering algorithm that estimates T and, if w(·) is generated by a Markov chain, it can be shown to converge to the correct value; see, e.g., [10], [11].

III. STOCHASTIC MODEL PREDICTIVE CONTROL

In this paper we apply the SMPC formulation based on scenario enumeration and multi-stage stochastic optimization introduced in [12]. Consider a process whose discrete-time dynamics are modelled by the linear system

  x(k+1) = Ax(k) + B1 u(k) + B2 w(k),  (4a)
  y(k) = Cx(k) + D1 u(k) + D2 w(k),  (4b)

where x(k) ∈ R^nx is the state, u(k) ∈ R^nu is the input, and w(k) ∈ W is an additive stochastic disturbance. The state, input, and output vectors may be subject to the constraints

  x(k) ∈ X, u(k) ∈ U, y(k) ∈ Y, ∀k ∈ Z0+.  (5)

To simplify the exposition, we consider hereafter a scalar disturbance w; the approach described below is easily extended to multi-dimensional disturbances. For predicting the evolution of the disturbance w, a time-varying probability vector p(k) = [p1(k), p2(k), ..., ps(k)]' is introduced, which defines the probability of the disturbance realization at time k,

  pj(k) = Pr[w(k) = wj],  j = 1, 2, ..., s,  (6)

with Σ_{j=1}^{s} pj(k) = 1 for all k ∈ Z0+. We model the evolution of p(k) by the Markov chain (1). As a consequence, the complete model of plant and disturbance is

  x(k+1) = Ax(k) + B1 u(k) + B2 w(k),  (7a)
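To make the batch estimate (3) concrete, the following sketch quantizes a measured sequence onto the levels w1, ..., ws by nearest neighbor, counts the transitions nij, and normalizes row-wise. The quantization levels and the measurement batch are illustrative values, not data from the paper; assigning a uniform distribution to never-visited rows is a modeling choice the paper does not specify.

```python
def nearest(levels, w):
    """Index of the quantization level closest to w (the argmin in (3))."""
    return min(range(len(levels)), key=lambda h: abs(w - levels[h]))

def estimate_T(levels, batch):
    """Estimate T from a batch {w(k)}, k = 0..L, via relative transition frequencies."""
    s = len(levels)
    n = [[0] * s for _ in range(s)]              # n_ij = |K_ij|
    for wk, wk1 in zip(batch, batch[1:]):        # consecutive pairs (w(k), w(k+1))
        n[nearest(levels, wk)][nearest(levels, wk1)] += 1
    T = []
    for row in n:
        ni = sum(row)                            # n_i = sum_j n_ij
        # Unvisited rows get a uniform distribution (illustrative choice).
        T.append([nij / ni if ni else 1.0 / len(row) for nij in row])
    return T

levels = [0.0, 1.0, 2.0]                         # W = {w1, w2, w3}
batch = [0.1, 0.9, 1.1, 2.2, 1.8, 0.2, 0.0, 1.0]
T = estimate_T(levels, batch)
```

Here each row of the returned T sums to one, so it is a valid transition probability matrix in the sense of (1).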
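Algorithm 1 itself can be sketched as follows: transitions are accumulated in N, and every τmax samples each row of T is refreshed by the filtered update of step 10, with λ weighting the old estimate against the new counts. The values of tau_max, lam, and the input stream below are illustrative tuning choices, not values from the paper.

```python
def learn_online(T, levels, w_stream, tau_max=4, lam=2.0):
    """Update the transition matrix T in place from the measured stream w(0), w(1), ..."""
    s = len(levels)
    N = [[0] * s for _ in range(s)]   # transition counts since the last update
    tau = 0
    nearest = lambda w: min(range(s), key=lambda h: abs(w - levels[h]))
    i = nearest(w_stream[0])          # step 3: quantize the first measurement
    for w in w_stream[1:]:            # steps 4-15
        tau += 1
        j = nearest(w)
        N[i][j] += 1                  # step 7: count the observed transition
        if tau == tau_max:            # steps 8-13: filtered row-wise update
            for h in range(s):
                scale = lam + sum(N[h])
                T[h] = [(N[h][l] + lam * T[h][l]) / scale for l in range(s)]
            N = [[0] * s for _ in range(s)]
            tau = 0
        i = j                         # step 14
    return T

T = [[0.5, 0.5], [0.5, 0.5]]          # initial guess for a 2-state chain
learn_online(T, [0.0, 1.0], [0, 0, 0, 0, 0], tau_max=4, lam=2.0)
```

Note that the update of step 10 preserves row-stochasticity: if row h of T sums to one before the update, the numerator sums to sum(N[h]) + λ, which is exactly the normalizing factor.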
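Finally, the prediction step used by the controller follows from (2) and (6): starting from the unitary vector ǫi (i.e., w(k) = wi known), the distribution of w(k+ℓ) is obtained by applying the chain ℓ times, p(k+1)' = p(k)' T. A minimal sketch, with an illustrative 2-state chain:

```python
def propagate(p, T, steps=1):
    """Return the probability vector p after `steps` applications of the chain (p' <- p' T)."""
    s = len(T)
    for _ in range(steps):
        p = [sum(p[i] * T[i][j] for i in range(s)) for j in range(s)]
    return p

T = [[0.9, 0.1], [0.4, 0.6]]        # illustrative 2-state transition matrix
eps_1 = [1.0, 0.0]                  # w(k) = w1, the unitary vector in (2)
p2 = propagate(eps_1, T, steps=2)   # distribution of w(k+2), i.e., row 1 of T^2
```

This is the mechanism the SMPC formulation of Section III relies on to weight future disturbance scenarios by their probabilities.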