49th IEEE Conference on Decision and Control, December 15-17, 2010, Hilton Atlanta Hotel, Atlanta, GA, USA

Stochastic Model Predictive Control with Driver Behavior Learning for Improved Powertrain Control

M. Bichi, G. Ripaccioli, S. Di Cairano, D. Bernardini, A. Bemporad, I.V. Kolmanovsky

M. Bichi, G. Ripaccioli, and D. Bernardini are with the Dept. of Information Engineering, University of Siena, Italy ([email protected], [email protected]). S. Di Cairano is with Powertrain Control R&A, Ford Research and Adv. Engineering ([email protected]). A. Bemporad is with the Dept. of Mechanical and Structural Engineering, University of Trento, Italy ([email protected]). I.V. Kolmanovsky is with the Dept. of Aerospace Engineering, the University of Michigan, Ann Arbor, MI ([email protected]).

Abstract— In this paper we advocate the use of stochastic model predictive control (SMPC) for improving the performance of powertrain control algorithms, by optimally controlling the complex system composed of driver and vehicle. While the powertrain is modeled as the deterministic component of the dynamics, the driver behavior is represented as a stochastic system which affects the vehicle dynamics. Since stochastic MPC is based on online numerical optimization, the driver model can be learned online, hence allowing the control algorithm to adapt to different drivers and drivers' behaviors. The proposed technique is evaluated in two applications: adaptive cruise control, where the driver behavioral model is used to predict the leading vehicle dynamics, and series hybrid electric vehicle (SHEV) energy management, where the driver model is used to predict the future power requests.

I. INTRODUCTION

Modern automotive vehicles are complex systems where the mechanical components interact with the control electronics and with the human driver. With the constant increase in powertrain complexity, the tightening of emissions standards, the increase in gas price, and the increased number of vehicle functionalities, advanced control algorithms are needed to meet the specification requirements while keeping sensor and actuator costs limited.

Model predictive control (MPC) [1] is an appealing candidate for control of complex automotive systems, due to its capability of coordinating multiple constrained actuators and optimizing the system behavior with respect to a performance objective. However, MPC requires a model of the dynamics, and, in general, the more precisely the model represents the real dynamics, the better the closed-loop performance is.
While suitable models are available for the dynamics of most vehicle components [2], the overall automotive vehicle behavior strongly depends on what the driver is doing. As such, including a prediction model of possible future driver's actions may increase the closed-loop performance of MPC.

In this paper we propose to model the driver as a stochastic process whose output affects a deterministic model of the vehicle. Thus, the obtained model of the vehicle and the driver dynamics is a stochastic dynamical system that requires appropriate stochastic control algorithms. Stochastic control algorithms have already been proposed in automotive applications to tackle the uncertainty arising from the vehicle environment. Concrete examples are stochastic dynamic programming applications to hybrid electric vehicle (HEV) energy management [3], [4] and emission control [5], and linear stochastic control for chassis control [6].

However, stochastic linear control is often inadequate to capture the variety of driver behaviors, while stochastic dynamic programming is numerically intensive and cannot be easily updated if the underlying statistical model changes. In recent years various stochastic model predictive control (SMPC) algorithms have been proposed, based on different prediction models and stochastic optimal control problems, see [7]–[9] and the references therein. In this paper, we model the deterministic dynamics by a linear system and the stochastic driver model by a Markov chain. The choice of Markov chains to represent driver behaviors is motivated by previous literature [3], [5], and by the approximation properties of Markov chains [10]. The overall model is used in a finite horizon stochastic optimal control problem, where the expected performance objective is optimized, subject to the constraints on states and inputs. Concurrently, the Markov chain modeling the driver is updated online by applying a simple learning algorithm, based on linear filtering of the transition frequencies. This allows the SMPC controller to adapt to changes in the driver behavior.

The paper is structured as follows. In Section II we discuss the driver model based on Markov chains and the corresponding online learning algorithm, and in Section III we introduce the stochastic model predictive control algorithm used in this paper. In Sections IV and V we present the applications of the proposed approach to adaptive cruise control (ACC) and to energy management in a series hybrid electric vehicle (SHEV), respectively. The conclusions are summarized in Section VI.

Notation: R, Z, Z0+ denote the sets of real, integer, and nonnegative integer numbers, respectively. For a set A, |A| denotes its cardinality. For a vector a, [a]i is the i-th component, and for a matrix A, [A]ij is the ij-th element and [A]i is the i-th row. We denote a square matrix of size s × s entirely composed of zeros by 0s, and the identity by Is. Subscripts are dropped when clear from the context.

II. STOCHASTIC DRIVER MODEL AND LEARNING ALGORITHM

We model the actions of the driver on the vehicle by a stochastic process w(·) where w(k) ∈ W, for all k ∈ Z0+. We assume that at time k the value w(k) can be measured, and, with a little abuse of notation, we denote by w(k) the measured realization of the disturbance at k ∈ Z0+.

Depending on the specific application, the vector w(k) may represent different quantities, such as the power request in an HEV, acceleration, velocity, the angular rate applied to the steering wheel, or any combination of the above.

For prediction purposes, the random process generating w is modelled as a Markov chain with states W = {w1, w2, ..., ws}, where wi ∈ W for all i ∈ {1, ..., s}. The cardinality |W| defines the trade-off between stochastic model complexity and precision. The Markov chain is defined by a transition probability matrix T ∈ R^{s×s} such that

[T]ij = Pr[w(k+1) = wj | w(k) = wi],   (1)

for all i, j ∈ {1, ..., s}, where w(k) is the state of the Markov chain at time k. By using the Markov chain model, given w(k) = wi, the probability distribution of w(k+ℓ) is computed as

Pr[w(k+ℓ) = wj | w(k) = wi] = [(T^ℓ)' ǫi]j,   (2)

where ǫi is the i-th unitary vector, i.e., [ǫi]i = 1 and [ǫi]j = 0 for all j ≠ i.

A straightforward extension of this model is to represent the stochastic process w(·) by a set of Markov chains Tm, m = 1, ..., µ, where, in automotive applications, the currently active Markov chain is chosen at every instant depending on current conditions such as velocity, temperature, or road surface.

The Markov chain transition matrix T can be updated online by different learning algorithms of varying complexity [10], [11]. From a batch of measurements {w(k)}_{k=0}^{L}, the Markov chain transition matrix T is estimated by

[T]ij = nij / ni,   i, j ∈ {1, ..., s},   (3)

where nij = |Kij|,

Kij = {k : argmin_{h∈{1,...,s}} |w(k) − wh| = i, argmin_{h∈{1,...,s}} |w(k+1) − wh| = j},

and ni = Σ_{j=1}^{s} nij for all i ∈ {1, ..., s}. (Note that the minimizers in the definition of Kij may not be unique. For those cases, omitted here for the sake of simplicity, a selection rule can be included; for instance, the smallest minimizer can be chosen.) When new values of w(·) are collected, the Markov chain is updated. The learning procedure used here is detailed in Algorithm 1, where N ∈ Z0+^{s×s} is used to store the measured data.

Algorithm 1 On-line driver's model learning procedure
 1: Given T, at k = 0 set:
 2:   N = 0s, τ = 0;
 3:   i = argmin_{h∈{1,...,s}} |w(0) − wh|;
 4: for all k ≥ 1 do
 5:   τ = τ + 1;
 6:   j = argmin_{h∈{1,...,s}} |w(k) − wh|;
 7:   [N]ij = [N]ij + 1;
 8:   if τ = τmax then
 9:     for all h ∈ {1, 2, ..., s} do
10:       [T]h = ([N]h + λ[T]h)(λ + Σ_{l=1}^{s} [N]hl)^{-1};
11:     end for
12:     N = 0s, τ = 0;
13:   end if
14:   i = j;
15: end for

Algorithm 1 is a linear filtering algorithm that estimates T and, if w(·) is generated by a Markov chain, it can be shown to converge to the correct value, see, e.g., [10], [11]. The parameter λ > 0 acts as the filter constant: a lower value increases the convergence speed at the price of a higher sensitivity to noise.
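To make the estimators concrete, the following Python sketch mirrors the batch estimate (3) and the filtered update in line 10 of Algorithm 1; the helper names (quantize, batch_estimate, learn_online) are ours, introduced only for illustration.

```python
import numpy as np

def quantize(w, W):
    """Index of the Markov state nearest to measurement w (ties: smallest index)."""
    return int(np.argmin(np.abs(w - W)))

def batch_estimate(w_batch, W):
    """Batch estimate (3) of T from samples w(0), ..., w(L); unvisited rows stay zero."""
    s = len(W)
    N = np.zeros((s, s))
    for wk, wk1 in zip(w_batch[:-1], w_batch[1:]):
        N[quantize(wk, W), quantize(wk1, W)] += 1           # n_ij counters
    return N / np.maximum(N.sum(axis=1, keepdims=True), 1)  # [T]ij = n_ij / n_i

def learn_online(w_stream, W, T, lam=10.0, tau_max=100):
    """Algorithm 1: periodic filtered update of T; lam is the filter constant."""
    s = len(W)
    N, tau = np.zeros((s, s)), 0
    i = quantize(w_stream[0], W)
    for w in w_stream[1:]:
        tau += 1
        j = quantize(w, W)
        N[i, j] += 1                       # count the observed transition i -> j
        if tau == tau_max:                 # filtered update (Algorithm 1, line 10)
            T = (N + lam * T) / (lam + N.sum(axis=1, keepdims=True))
            N, tau = np.zeros((s, s)), 0
        i = j
    return T
```

Note that each row of T remains a probability distribution after the update, since the numerator of row h sums to λ + Σl [N]hl, i.e., exactly the denominator.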

III. STOCHASTIC MODEL PREDICTIVE CONTROL

In this paper we apply the SMPC formulation based on scenario enumeration and multi-stage stochastic optimization introduced in [12]. Consider a process whose discrete-time dynamics are modelled by the linear system

x(k+1) = Ax(k) + B1 u(k) + B2 w(k)   (4a)
y(k) = Cx(k) + D1 u(k) + D2 w(k),   (4b)

where x(k) ∈ R^{nx} is the state, u(k) ∈ R^{nu} is the input, and w(k) ∈ W is an additive stochastic disturbance. The state, input, and output vectors may be subject to the constraints

x(k) ∈ X, u(k) ∈ U, y(k) ∈ Y, ∀k ∈ Z0+.   (5)

To simplify the exposition, we consider hereafter a scalar disturbance w. However, the approach described below is easily extended to multi-dimensional disturbances. For predicting the evolution of the disturbance w, a time-varying probability vector p(k) = [p1(k), p2(k), ..., ps(k)]' is introduced, which defines the probability of the disturbance realization at time k,

pj(k) = Pr[w(k) = wj], j = 1, 2, ..., s,   (6)

with Σ_{j=1}^{s} pj(k) = 1 for all k ∈ Z0+. We model the evolution of p(k) by the Markov chain (1). As a consequence, the complete model of plant and disturbance is

x(k+1) = Ax(k) + B1 u(k) + B2 w(k)   (7a)
y(k) = Cx(k) + D1 u(k) + D2 w(k)   (7b)
Pr[w(k+1) = wj | w(k) = wi] = [T]ij.   (7c)
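In code, the disturbance side of model (7) amounts to propagating the probability vector through the transition matrix; a minimal sketch (the function name is ours):

```python
import numpy as np

def propagate(T, p0, ell):
    """Propagate the disturbance distribution ell steps: p(k+1)' = p(k)' T, cf. (6)-(7c).

    If w(k) = wi is measured, take p0 equal to the i-th unit vector;
    the result then reproduces the ell-step prediction (2).
    """
    p = np.asarray(p0, dtype=float)
    for _ in range(ell):
        p = T.T @ p        # pj(k+1) = sum_i [T]ij pi(k)
    return p
```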

The adopted SMPC problem formulation is based on a maximum likelihood approach, where at every time step an optimization tree (scenario tree) is built using the updated information on the system state and on the stochastic disturbance. Each node of the tree represents a predicted disturbance scenario which is taken into account in the optimization problem. Starting from the root node, which is associated with the current measurement of w(k), a list of candidate nodes is generated by considering all the possible future values of the disturbance, together with their realization probability. Then, the node with maximum probability is added to the tree. The procedure is repeated iteratively, by generating at every step new candidates as children of the last node added to the tree, until the tree contains nmax nodes. Note that every sequence of connected nodes (i.e., a path) in the tree represents a disturbance sequence realization, where the number of nodes is the length of the path.

The constructed tree produces a "multiple-horizon" approach, where different paths may have different prediction horizons. The tree generation algorithm expands the tree in the most likely direction, so that the paths with higher probability are extended further into the future, since they will have more impact on the performance optimization. Causality of the resulting control law is enforced by allowing only one control move for every node, except leaf nodes (i.e., nodes with no successor), which have no associated control move. The reader is referred to [12] for further details.

Let us introduce the following quantities to formally define the proposed control problem:
• T = {T1, T2, ..., Tn}: the set of tree nodes. Nodes are indexed progressively as they are added to the tree (i.e., T1 is the root node and Tn is the last node added).
• xN, uN: the state and the input, respectively, associated with node N.
• πN: the probability of reaching node N from T1.
• pre(N): the predecessor of node N.
• succ(N, j): the successor of node N with mode j.
• S ⊂ T: the set of leaf nodes, defined as S ≜ {Ti ∈ T, i = 1, 2, ..., n : succ(Ti, j) ∉ T, j = 1, 2, ..., s}.

After the tree has been built, the following stochastic MPC optimization problem is solved:

min_u  Σ_{i∈T\{T1}} πi (xi − xref)' Q (xi − xref) + Σ_{j∈T\S} πj (uj' R uj + yj' S yj)   (8a)
s.t.  x1 = x(k), w1 = w(k),   (8b)
      xi = A x_{pre(i)} + B1 u_{pre(i)} + B2 w_{pre(i)},  i ∈ T\{T1},   (8c)
      yi = C xi + D1 ui + D2 wi,  i ∈ T\S,   (8d)
      xi ∈ X,  i ∈ T\{T1},   (8e)
      ui ∈ U, yi ∈ Y,  i ∈ T\S,   (8f)

where u = {ui}_{i=1}^{mu} is the decision vector and mu = |T| − |S| is the number of optimization variables. Then, the decision vector element u1 associated with the root node T1 of the tree is used as the control input u(k).

Summarizing, the stochastic model predictive control algorithm is composed of three steps. At any control cycle k ∈ Z0+, (i) the optimization tree is built from the current value of the state and measured disturbance, (ii) problem (8) is solved for the obtained optimization tree, (iii) the input vector u(k) = u1 is applied to the system.
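A simplified Python sketch of the greedy tree construction may clarify the procedure; this is our illustration of the algorithm described in [12], and the data structures are not the authors'.

```python
import heapq
import numpy as np

def build_scenario_tree(T, i0, n_max):
    """Greedy maximum-likelihood scenario tree.

    T     : s-by-s transition matrix of the disturbance Markov chain
    i0    : index of the measured disturbance state at the root, w(k)
    n_max : total number of tree nodes
    Returns parallel lists (state, prob, pred): Markov state index of each
    node, probability pi_N of reaching it from the root, and its predecessor.
    """
    s = T.shape[0]
    state, prob, pred = [i0], [1.0], [None]         # root node T1
    cand = [(-T[i0, j], j, 0) for j in range(s)]    # candidates: children of root
    heapq.heapify(cand)                             # min-heap over -probability
    while len(state) < n_max and cand:
        neg_p, j, parent = heapq.heappop(cand)      # most likely candidate overall
        p = -neg_p
        state.append(j); prob.append(p); pred.append(parent)
        n = len(state) - 1
        for jj in range(s):                         # its children become candidates
            heapq.heappush(cand, (-p * T[j, jj], jj, n))
    return state, prob, pred
```

Given the tree, problem (8) is a quadratic program whose decision variables are the inputs at the non-leaf nodes, so it can be solved with standard QP solvers.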
In the next sections we demonstrate two applications of SMPC with learning in the automotive domain.

IV. APPLICATION TO ADAPTIVE CRUISE CONTROL

Adaptive cruise control (ACC) [13] extends the functionalities of conventional cruise control. In conventional cruise control, the reference velocity set by the driver is tracked by controlling the throttle, rejecting disturbances such as road slope and air drag. Adaptive cruise control, in addition, enforces a separation distance from the leading traffic to increase driving comfort and safety. In what follows, we consider two vehicles: the follower or host, which is the vehicle equipped with the ACC system, and the leader or target vehicle. The ACC controls the follower acceleration in order to track the desired velocity as closely as possible without violating a minimum separation distance between leader and follower, in spite of the fact that the acceleration of the leader is time-varying and not controllable.

Fig. 1. Adaptive cruise control operation.

As a prediction model for the MPC control problem, we consider the scheme represented in Figure 1, where v(k) and a(k) are the speed and the acceleration of the follower, respectively, and vl(k) is the velocity of the leader. The acceleration a(k) is modeled as the integrator

a(k+1) = a(k) + Ts u(k),   (9)

where Ts = 1 s is the sampling period and the control input u(k) is the rate of change of the acceleration (jerk). The leader and follower velocities are

v(k+1) = v(k) + Ts a(k),   (10)
vl(k+1) = vl(k) + Ts al(k),   (11)

where al(k) is the leader acceleration, which is assumed to be a stochastic disturbance. Thus, the distance d(k) between the leader and the follower evolves as

d(k+1) = d(k) + Ts (vl(k) − v(k)).   (12)

The system dynamics are modelled by (4a), where x(k) = [d(k) v(k) a(k) vl(k)]', w(k) = al(k), and

A = [1 −Ts 0 Ts; 0 1 Ts 0; 0 0 1 0; 0 0 0 1],   B1 = [0; 0; Ts; 0],   B2 = [0; 0; 0; Ts].   (13)

In order to guarantee comfort and safety, the state and the manipulated input are subject to the constraints (5), where

X ≜ {x ∈ R^4 : [x]1 ≥ dmin(k), 0 ≤ [x]2 ≤ vref},   (14a)
U ≜ {u ∈ R : umin ≤ u ≤ umax}.   (14b)
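For concreteness, the prediction model (13) can be instantiated as below; this is a sketch, where the state ordering follows the text and the numerical state, jerk, and leader acceleration in the usage example are ours.

```python
import numpy as np

Ts = 1.0   # sampling period [s]
# state x = [d, v, a, vl]': distance, follower speed, follower accel., leader speed
A = np.array([[1.0, -Ts, 0.0,  Ts],
              [0.0, 1.0,  Ts, 0.0],
              [0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0]])
B1 = np.array([0.0, 0.0, Ts, 0.0])   # input channel: jerk u(k)
B2 = np.array([0.0, 0.0, 0.0, Ts])   # disturbance channel: leader accel. al(k)

# one-step prediction (4a) from an illustrative state, jerk, and leader accel.
x = np.array([30.0, 20.0, 0.0, 20.0])      # d = 30 m, v = vl = 20 m/s, a = 0
x_next = A @ x + B1 * 0.5 + B2 * (-0.2)    # u = 0.5 m/s^3, al = -0.2 m/s^2
```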

The minimum separation distance varies with the velocity v(k),

dmin(k) = δ + γ v(k),   (15)

where δ = 3 m is a constant value that models the minimum distance for the case where v(k) = 0, and γ = 2 s is the headway time-gap, i.e., the time needed to reach the current position of the leader. The bound vref is the desired speed set by the driver, which shall not be exceeded, and the jerk is bounded to guarantee comfort, umax = −umin = 3 m/s³. In addition, the follower velocity must be non-negative.
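As a small worked example of (15): at v = 20 m/s the constraint requires at least δ + γv = 3 + 40 = 43 m of separation; in code (the helper name is ours):

```python
def d_min(v, delta=3.0, gamma=2.0):
    """Minimum separation distance (15): standstill gap plus headway time-gap."""
    return delta + gamma * v     # d_min(20.0) -> 43.0 [m]
```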

For the ACC control problem, in model (4) the disturbance w(k) is the acceleration al(k) of the leading vehicle. This is modelled by a Markov chain that can take s = 9 different values. The Markov chain is first trained offline by (3) with acceleration profiles from several standard driving cycles; then the Markov chain is adapted online by Algorithm 1.

In the optimization problem (8) we set

xref = [dref; vref; 0; 0],   Q = diag(Qd, Qv, 0, 0),   R = Qu,

where Qd, Qv, and Qu are the weights on the tracking error of the relative distance, on the tracking error of the desired velocity, and on the jerk, respectively. The reference distance is also time-varying,

dref(k) = δref + γref v(k),   (16)

where δref = 4 m and γref = 3 s. By the term that weighs u in (8a), the controller tries to minimize the jerk to provide smooth accelerations that improve comfort and reduce fuel consumption.
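With the values used in the simulations below, this tuning can be written compactly as follows (a sketch; the helper name is ours):

```python
import numpy as np

Qd, Qv, Qu = 0.1, 5.0, 1e4         # weights used in the EUDC simulations
Q = np.diag([Qd, Qv, 0.0, 0.0])    # follower accel. and leader speed not tracked
R = np.array([[Qu]])

def x_ref(v, delta_ref=4.0, gamma_ref=3.0, v_ref=26.0):
    """Time-varying reference [dref(k), vref, 0, 0]', with dref(k) from (16)."""
    return np.array([delta_ref + gamma_ref * v, v_ref, 0.0, 0.0])
```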

We have simulated the closed-loop system, where the European Urban Driving Cycle (EUDC) is used to model vl, the speed reference is vref = 26 m/s, the weights are Qd = 0.1, Qv = 5, Qu = 10^4, and the optimization tree which defines the optimal control problem is built with nmax = 50 nodes. We have compared the SMPC controller with a deterministic "frozen-time" MPC (FTMPC), where w is assumed constant in prediction, and with a "prescient" MPC (PMPC), where the disturbance w is exactly known in advance on the entire prediction horizon. Figure 2(a) shows a comparison, in terms of vehicle speed, between the controllers. The SMPC has an intermediate behavior between the FTMPC and the PMPC controller, where the latter knows the variations of vl in advance. Indeed, SMPC provides better distance and velocity reference tracking than the simpler FTMPC. One can also notice how the SMPC anticipates the driver's action with respect to FTMPC, especially when the leader stops accelerating at 60 s and in the long braking action at 370 s.

Fig. 2. Comparison between the SMPC, FTMPC and PMPC for adaptive cruise control. (a) Velocity and reference velocity. (b) Distance, reference distance, and distance constraint.

V. APPLICATION TO ENERGY MANAGEMENT IN HYBRID ELECTRIC VEHICLES

As a second application of SMPC, we consider the energy management problem of a series HEV (hybrid electric vehicle) [14], where only the electric motor is directly connected to the traction driveline. The power supplied to the motor is the combination of the power provided by a battery and of the power produced by a generator connected to the internal combustion engine (ICE). A preliminary study of SMPC application to this problem was proposed by the authors in [15]. Here, we extend the approach with online learning of the Markov chain, and by considering a refined model with both regenerative braking (use of the motor to recharge the battery) and non-regenerative braking capabilities.

The purpose of the energy management controller is to minimize the fuel consumption by optimally delivering the power requested by the motor, Preq, which is a function of the driver commands on the gas and brake pedals and of the current vehicle conditions. The energy management system selects how much power Pmec must be provided by the ICE through the generator, and how much power Pel by the battery. The power balance equation

Preq(k) = Pel(k) + Pmec(k) − Pbr(k),   ∀k ∈ Z0+,   (17)

is enforced, where Pbr ≥ 0 is the power drained by conventional friction brakes, in case regenerative braking is not sufficient to provide the desired vehicle braking power.

Since the rotational dynamics of the ICE, motor, and generator are much faster than the battery charging dynamics, the main relevant dynamics are those of the (normalized) battery state of charge,

SoC(k+1) = SoC(k) − K Ts Pel(k),   (18)

where SoC ∈ [0, 1], SoC = 1 corresponds to a fully charged battery, Ts = 1 s is the sampling period, and K > 0 is a scalar parameter identified for a generic HEV battery. Note that a positive value of Pel indicates that power is provided by the battery to the motor.

The SMPC-based energy management system controls the conventional braking power Pbr and the variation of the mechanical power provided by the ICE, ∆P, where

∆P(k) = Pmec(k) − Pmec(k−1).   (19)

By combining (17)–(19), we get a model as in (4) with

x(k) = [SoC(k); Pmec(k−1)],   u(k) = [∆P(k); Pbr(k)],   y(k) = Pel(k),   w(k) = Preq(k),   (20)

A = [1 KTs; 0 1],   B1 = [KTs −KTs; 1 0],   B2 = [−KTs; 0],
C = [0 −1],   D1 = [−1 1],   D2 = 1.

The power request Preq(k) is the stochastic disturbance w(k), and the battery power Pel(k) is the output, since it is uniquely assigned by (17), given x(k) and u(k).
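In code, model (20) reads as follows; this is a sketch, and since the value of K is not reported here, the number below is an illustrative placeholder.

```python
import numpy as np

Ts = 1.0     # sampling period [s]
K = 1e-6     # battery parameter K > 0; illustrative placeholder value

# x = [SoC, Pmec(k-1)]', u = [dP, Pbr]', w = Preq, y = Pel
A  = np.array([[1.0, K * Ts], [0.0, 1.0]])
B1 = np.array([[K * Ts, -K * Ts], [1.0, 0.0]])
B2 = np.array([[-K * Ts], [0.0]])
C  = np.array([[0.0, -1.0]])
D1 = np.array([[-1.0, 1.0]])
D2 = np.array([[1.0]])

def step(x, u, w):
    """One step of (20); the output y(k) = Pel(k) follows from the balance (17)."""
    w = np.atleast_1d(w)
    return A @ x + B1 @ u + B2 @ w, C @ x + D1 @ u + D2 @ w
```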

A. Stochastic model predictive controller design

In order to design the stochastic MPC controller, the power request, which is decided by the driver, is modeled as a Markov chain (1), according to Section II. In [15] the authors have discussed the case where the Markov chain model is trained offline on data from multiple standard driving cycles, hence representing an average driver behavior, which is an approach similar to the one used in stochastic dynamic programming [3]. In this paper we illustrate the case where the transition probabilities are learned online by Algorithm 1.

SMPC has a clear advantage over stochastic dynamic programming with respect to adaptation to changes in the driver's behavior. Stochastic dynamic programming does not easily allow for updates of the control law in reaction to observed changes in the stochastic model, because of the numerically intensive computations required to solve the dynamic program.

With respect to using a fixed Markov chain learned offline as in [15], we show that adaptation provides improvements in fuel economy. By learning transition probabilities online, the Markov chain better represents a specific drive cycle/driver, and the prediction capabilities of SMPC improve consequently. This approach is also more meaningful in "every day driving", which does not exactly match the standard drive cycles, but indeed has a specific pattern that reflects the commonly travelled roads, the local traffic flows, and the specific driving style of the driver.

For predicting Preq, a Markov chain with s = 16 states is used, initialized by T = I. We run the controller multiple times through the drive cycle used for the simulations in Section V-B in order to learn the transition probabilities. The stochastic MPC problem (8) is formulated based on model (4), where the dynamic parameters are defined by (20). By solving (8), the SMPC algorithm selects the optimal mechanical power variation and conventional braking power. The desired mechanical power is converted into the optimal engine operating point by the static relation

[τ(k), ω(k)] = f(Pmec(k)),

based on the stationary optimal engine power curve. A low-level controller provides tracking of the desired engine operating point. In order to operate the engine always around the optimal power curve, the mechanical power transients should be short, which requires reduced mechanical power variations [15]. This also reduces power losses due to inertia. Thus, the cost at each node is

J(k) = r∆P ∆P(k)² + qP (Pmec(k) − P*mec)² + qsoc (SoC(k) − SoCref)² + rbr Pbr(k)²,   (21)

where P*mec is the ICE maximum efficiency power, SoCref is the reference state of charge, the weight r∆P enforces smooth mechanical power variations, the weight qP pushes the system to operate closer to the maximum efficiency power, and qsoc and rbr penalize deviations from the battery setpoint and the use of the friction brakes, respectively. From (21), the cost function (8a) is defined by setting

xref = [SoCref; P*mec],   Q = diag(qsoc, qP),   R = diag(r∆P, rbr),   S = 0.

The constraints in the SMPC problem (8) are used to enforce desired operating ranges for the variables, which results in an increased battery lifetime and enforces the electric and mechanical limits. In detail, we impose (5) with

X = {x ∈ R² : SoCmin ≤ [x]1 ≤ SoCmax, 0 ≤ [x]2 ≤ Pmec,max},
U = {u ∈ R² : ∆Pmin ≤ [u]1 ≤ ∆Pmax, [u]2 ≥ 0},
Y = {y ∈ R : Pel,min ≤ y ≤ Pel,max}.

To guarantee that (17) is enforced at each sample time, the constraint on ∆P(k) has been implemented as a soft constraint. For additional details on controller tuning see [15].
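Softening a constraint is a standard construction: a nonnegative slack variable relaxes the bound and is penalized in the cost, so the bound is violated only when strictly necessary. A minimal sketch of the mechanism in CVXPY (our generic illustration, with made-up numbers, not the authors' implementation):

```python
import cvxpy as cp

# One-variable illustration: track a desired power step under a softened
# bound |dP| <= dP_max; the slack eps >= 0 is penalized with weight rho.
dP = cp.Variable()
eps = cp.Variable(nonneg=True)
dP_des, dP_max, rho = 7e3, 5e3, 1e-2     # illustrative values [W]

cost = cp.square(dP - dP_des) + rho * cp.square(eps)
prob = cp.Problem(cp.Minimize(cost), [cp.abs(dP) <= dP_max + eps])
prob.solve()
# dP.value exceeds the 5 kW bound only by as much as the penalty rho allows
```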
B. Simulations of the SMPC energy management controller

The quasi-static nonlinear simulation model of a light hybrid vehicle derived from the QSS toolbox [14] is used for generating simulation data. The controller parameters in the simulations are SoCmin = 0.4, SoCmax = 0.6, Pmec,max = 20 kW, ∆Pmax = −∆Pmin = 5 kW, Pel,max = −Pel,min = 40 kW. The constraints on SoC are set tightly around 50% of battery charge to preserve battery lifetime. The weights in (8a) are qsoc = 500, qP = 0.2, r∆P = 0.4, rbr = 1000, and the state references are SoCref = 0.5 and Pmec,ref = P*mec = 15.87 kW. The optimization tree defining the optimal control problem has nmax = 100 nodes.

The SMPC controller is again compared with a frozen-time MPC controller (FTMPC), which assumes Preq to be constant in prediction, and with a prescient MPC controller (PMPC), which has perfect knowledge of the future power request along the entire prediction horizon. For the simulations, we have used the New European Driving Cycle (NEDC). Figures 3 and 4 show the mechanical power variation and the battery state of charge for the three controllers, respectively.

TABLE I
SIMULATION RESULTS ON THE NEDC CYCLE

                    fuel cons. [kg]   % Fuel Improv.
  FTMPC             0.281             −
  SMPC (static)     0.243             13.5%
  SMPC (adaptive)   0.199             29.2%
  PMPC              0.197             29.9%

Fig. 3. Mechanical power variation ∆P for SMPC (solid line), PMPC (dashed line), and FTMPC (dash-dotted line).

Fig. 4. State of charge SoC for SMPC (solid line), PMPC (dashed line), and FTMPC (dash-dotted line).

A comparison of the resulting fuel economy on the NEDC cycle (lasting 1220 s) is reported in Table I, together with the improvement with respect to the base FTMPC controller. Table I also shows the fuel consumption of the SMPC controller discussed in [15], where no adaptation was used. Even if the fuel consumption is not explicitly minimized, SMPC provides better fuel economy than FTMPC, since SMPC performs a better prediction of the future power requests, and the minimization of cost function (21) forces an efficient operation of the powertrain. In fact, the battery power is used to mitigate the ICE power transients, which are in general inefficient, and the controller operates the ICE on the optimal curve, possibly close to the maximum efficiency point P*mec.

VI. CONCLUSIONS

We have developed an approach for the control of complex powertrain systems based on modeling the vehicle as a deterministic dynamical system, and the driver as a stochastic process whose dynamics are updated online. Stochastic model predictive control is applied to optimize the expected performance over a tree of scenarios, while enforcing constraints on states, inputs, and outputs. From a computational viewpoint, by assuming a linear system model we solve the SMPC problem via standard quadratic programming. Compared to stochastic dynamic programming, the proposed controller is easily reconfigurable to changing stochastic parameters and it can more easily handle high-order models. The overall procedure has been exemplified in two automotive applications, adaptive cruise control and the energy management of hybrid electric vehicles. In both cases we have been able to demonstrate improved performance with respect to deterministic MPC schemes, due to the ability of SMPC to provide better predictions without violating causality, as happens in anticipative MPC schemes.

REFERENCES

[1] C. Garcia, D. Prett, and M. Morari, "Model predictive control: theory and practice–a survey," Automatica, vol. 25, no. 3, pp. 335–348, 1989.
[2] L. Guzzella and C. Onder, Introduction to Modeling and Control of Internal Combustion Engine Systems. Springer, 2004.
[3] C. Lin, H. Peng, and J. Grizzle, "A stochastic control strategy for hybrid electric vehicles," in Proc. American Contr. Conf., vol. 5, 2004, pp. 4710–4715.
[4] L. Johannesson, M. Asbogard, and B. Egardt, "Assessing the potential of predictive control for hybrid vehicle powertrains using stochastic dynamic programming," IEEE Transactions on Intelligent Transportation Systems, vol. 8, no. 1, p. 71, 2007.
[5] I. Kolmanovsky, I. Siverguina, and B. Lygoe, "Optimization of powertrain operating policy for feasibility assessment and calibration: stochastic dynamic programming approach," in Proc. American Contr. Conf., vol. 2, Anchorage, AK, 2002.
[6] D. Wilson, R. Sharp, and S. Hassan, "The application of linear optimal control theory to the design of active automotive suspensions," Vehicle Syst. Dyn., vol. 15, no. 2, pp. 105–118, 1986.
[7] A. Bemporad and S. Di Cairano, "Optimal control of discrete hybrid stochastic automata," in Hybrid Systems: Computation and Control, ser. Lecture Notes in Computer Science, no. 3414, M. Morari and L. Thiele, Eds. Springer-Verlag, 2005, pp. 151–167.
[8] P. Couchman, M. Cannon, and B. Kouvaritakis, "Stochastic MPC with inequality stability constraints," Automatica, vol. 42, pp. 2169–2174, 2006.
[9] J. Primbs, "Stochastic receding horizon control of constrained linear systems with state and control multiplicative noise," in Proc. American Contr. Conf., New York, NY, 2007, pp. 4470–4475.
[10] B. D. O. Anderson, "From Wiener to hidden Markov models," IEEE Control Systems Magazine, vol. 19, no. 3, pp. 41–51, 1999.
[11] L. Rabiner, "A tutorial on hidden Markov models and selected applications in speech recognition," Proc. of the IEEE, vol. 77, no. 2, pp. 257–286, 1989.
[12] D. Bernardini and A. Bemporad, "Scenario-based model predictive control of stochastic constrained linear systems," in Proc. 48th IEEE Conf. on Decision and Control, Shanghai, China, 2009, pp. 6333–6338.
[13] C. Liang and H. Peng, "Optimal adaptive cruise control with guaranteed string stability," Vehicle System Dynamics, vol. 32, no. 4-5, pp. 313–330, 1999.
[14] L. Guzzella and A. Amstutz, "QSS Toolbox Manual," Institut für Mess- und Regeltechnik, Eidgenössische Technische Hochschule Zürich, Zürich, 2005.
[15] G. Ripaccioli, D. Bernardini, S. Di Cairano, A. Bemporad, and I. Kolmanovsky, "A stochastic model predictive control approach for series hybrid electric vehicle power management," in Proc. American Contr. Conf., Baltimore, MD, 2010, pp. 5844–5849.
