Mathematical Control and Related Fields, Volume 9, Number 3, September 2019, pp. 541–570. doi:10.3934/mcrf.2019025

A NON-EXPONENTIAL DISCOUNTING TIME-INCONSISTENT STOCHASTIC OPTIMAL CONTROL PROBLEM FOR JUMP-DIFFUSION

Ishak Alia
Department of Mathematics, University of Bordj Bou Arreridj, 34000, Algeria

(Communicated by Jiongmin Yong)

Abstract. In this paper, we study general time-inconsistent stochastic control models which are driven by a stochastic differential equation with random jumps. Specifically, the time-inconsistency arises from the presence of a non-exponential discount function in the objective functional. We consider equilibrium, instead of optimal, solutions within the class of open-loop controls. We prove an equivalence relationship between our time-inconsistent problem and a time-consistent problem such that the equilibrium controls for the time-consistent problem coincide with the equilibrium controls for the time-inconsistent problem. We establish two general results which characterize the open-loop equilibrium controls. As special cases, a generalized Merton's portfolio problem and a linear-quadratic problem are discussed.

1. Introduction. We consider in this paper stochastic control problems in which the system under consideration is governed by an SDE of the following type
\[
\begin{cases}
dX(s) = b(s,X(s),u(s))\,ds + \sigma(s,X(s),u(s))\,dW(s) + \displaystyle\int_Z c(s,X(s-),u(s-),z)\,\tilde N(ds,dz),\\[1mm]
X(t) = x,
\end{cases}\tag{1}
\]
and, for any fixed initial pair $(t,x)$, the objective is to maximize the expected utility functional
\[
J(t,x;u(\cdot)) = \mathbb E_{t,x}\Big[\int_t^T \nu(t,s)\,f(s,u(s))\,ds + \nu(t,T)\,h(X(T))\Big],\tag{2}
\]
over the set of admissible controls. In the above model $b$, $\sigma$, $c$, $f$ and $h$ are deterministic functions. In particular, $\nu(t,s)f(s,u(s))$ is the discounted local utility and $\nu(t,T)h(X(T))$ is the terminal utility, where $\nu(\cdot,\cdot)$ represents the discount function. The common assumption in most of the existing literature is that the discount rate is constant over time, leading to the exponential form of the discount function:
\[
\nu(t,s) = e^{-\delta(s-t)} \quad\text{and}\quad \nu(t,T) = e^{-\delta(T-t)},
\]

2010 Mathematics Subject Classification. Primary: 93E20, 91B08, 91B70; Secondary: 62P20, 60H30, 60H10, 49N90, 93E25. Key words and phrases. Time-inconsistency, stochastic control, non-exponential discounting, equilibrium control, dynamic programming principle, stochastic maximum principle, Poisson point process.

where $\delta>0$ is some constant which represents the discount rate. There is a very good reason for this assumption: it is easy to see that with the above form of the discount function, the optimal control problem (1)-(2) is time-consistent in the sense that Bellman's optimality principle is satisfied. Therefore, the dynamic programming approach can be applied directly and one may derive a closed-loop representation of the optimal control via a Hamilton-Jacobi-Bellman (HJB) equation. However, results from experimental studies contradict this assumption (see, for example, [1] or [22]), indicating that the discount rate changes over time and that discount rates for the near future are much lower than discount rates for times further away in the future. In particular, Ainslie [1] performed empirical studies on human and animal behavior and found that discount functions are almost hyperbolic; that is, they decrease like a negative power of time rather than an exponential.

On the other hand, Strotz [31] showed that as soon as the discounting is non-exponential, the models become time-inconsistent in the sense that they do not admit Bellman's optimality principle. Therefore, an optimal control need not remain optimal as time evolves. However, since time-consistency is important for a rational decision maker, researchers began to look for time-consistent strategies for non-exponential discounting optimal control problems. The main approach is to formulate the time-inconsistent problem in a game theoretic framework and look for Nash equilibrium solutions. For a detailed introduction see e.g. [31], [29], [28], [16], [4], [20], [13], [14], [15], [6], [8], [9], [23], [34], [36], [35], [37], [39], [12], [33] and the references therein. In particular, Ekeland and Lazrak [13] and Ekeland and Pirvu [14] investigated the optimal consumption-investment problem under non-exponential discounting for deterministic and stochastic models. Among their very important contributions, they provided a precise definition of the feedback equilibrium control in continuous time, using a spike variation formulation. In addition, they derived an extended HJB equation (an extension of the standard Hamilton-Jacobi-Bellman equation displaying a nonlocal term) along with a verification theorem that characterizes feedback equilibriums. An extended HJB equation was also derived in Marín-Solano and Navas [23], which investigated a consumption-investment problem with non-constant discount rate for both naive and sophisticated agents. Björk and Murgoci [6] generalized the extended HJB equations method to a quite general class of time-inconsistent stochastic control problems. In addition, they proved that for every time-inconsistent problem, there exists an associated time-consistent problem such that the optimal control and the optimal value function for the consistent problem coincide with the equilibrium control and value function, respectively, for the inconsistent problem. Yong [34] provided an alternative approach by discretization of time for the game in his deterministic time-inconsistent linear quadratic (LQ) model, and he constructed an equilibrium solution via some class of coupled Riccati-Volterra equations. Yong [36], again by discretization of time for the game, investigated a class of general discounting time-inconsistent stochastic optimal control problems and derived the so-called equilibrium HJB equation along with a verification theorem that characterizes closed-loop equilibrium controls.
Following Yong's approach, Zhao et al. [40] studied the consumption-investment problem under a general discount function and a logarithmic utility function. Hu et al. ([17], [18]) dealt with another kind of time-inconsistent stochastic LQ control problems. In their model, the time-inconsistency arises from the presence of a quadratic term of the expected state as well as a state-dependent term in the objective functional. Among their achievements, they suggested the concept of the Nash equilibrium control within the class of open-loop controls. Using a duality method, they characterized the open-loop equilibrium control via a stochastic system that includes a flow of forward-backward stochastic differential equations (FBSDEs), whose solvability remains a challenging open problem except for some special cases. The work of Djehiche and Huang [11] extended [17] by characterizing equilibrium controls via a Pontryagin-type stochastic maximum principle. More recently, Hu et al. [19] extended the work [18] by incorporating control constraints. Finally, in a series of papers, Basak and Chabakauri [5], Czichowsky [10] and Björk et al. [7] looked at the mean-variance problem, which is also time-inconsistent.

This paper studies time-consistent solutions to the general discounting time-inconsistent stochastic optimal control problem (1)-(2). Specifically, we adopt a game theoretic approach to handle the time-inconsistency and we aim to characterize the open-loop Nash equilibrium controls. Differently from most of the existing literature, in order to characterize the equilibrium controls, we begin by establishing a relationship between our time-inconsistent optimal control problem and a time-consistent problem, such that the equilibrium controls for the consistent problem coincide with the equilibrium controls for the time-inconsistent problem. As a consequence, any optimal control of the time-consistent problem coincides with an equilibrium control of the time-inconsistent problem. By using the standard approaches of classical stochastic control theory (i.e. the stochastic maximum principle (SMP) and dynamic programming (DP)), we establish two general results which characterize the equilibrium controls: (i) the first one is a verification theorem associated with a standard HJB equation; (ii) the second result provides a complete characterization of the equilibrium controls via a necessary and sufficient condition in the form of a stochastic maximum principle. Finally, to illustrate our results, we discuss two concrete examples: (1) in the first example, we consider a consumption-investment problem with general discounting and apply the verification theorem (Theorem 4.1) to derive the equilibrium consumption and investment strategies in state feedback form; (2) in the second example, we discuss a general discounting LQ model and apply the stochastic maximum principle in Theorem 4.2 to derive the Nash equilibrium solution in a linear feedback form via a standard Riccati equation.

The rest of the paper is organized as follows. In the second section, we formulate our problem and give the necessary notations and preliminaries. In Section 3, we formulate the objective and present the main result of this work. In Section 4, we present some general results on equilibriums. Finally, in Section 5, we discuss two special cases.

2. Formulation of the problem. Throughout this paper $(\Omega,\mathcal F,(\mathcal F_t)_{t\in[0,T]},\mathbb P)$ is a filtered probability space such that $\mathcal F_0$ contains all $\mathbb P$-null sets, $\mathcal F_T = \mathcal F$ for an arbitrarily fixed finite time horizon $T>0$, and $(\mathcal F_t)_{t\in[0,T]}$ satisfies the usual conditions. We assume that $(\mathcal F_t)_{t\in[0,T]}$ is generated by a one-dimensional standard Brownian motion $(W(t))_{t\in[0,T]}$ and an independent Poisson measure $N$ on $[0,T]\times Z$, where $Z\subseteq\mathbb R-\{0\}$. We assume that the compensator of $N$ has the form $\mu(dt,dz) = \theta(dz)\,dt$ for some positive and $\sigma$-finite Lévy measure $\theta$ on $Z$, endowed with its Borel $\sigma$-field $\mathcal B(Z)$. We suppose that $\int_Z 1\wedge|z|^2\,\theta(dz) < \infty$ and write $\tilde N(dt,dz) = N(dt,dz) - \theta(dz)\,dt$ for the compensated jump martingale random measure of $N$. Obviously, we have

\[
\mathcal F_t = \sigma\Big[\int_{A\times(0,s]} N(dr,dz);\ s\le t,\ A\in\mathcal B(Z)\Big] \vee \sigma[W(s);\ s\le t] \vee \mathcal N,
\]
where $\mathcal N$ denotes the totality of $\mathbb P$-null sets, and $\sigma_1\vee\sigma_2$ denotes the $\sigma$-field generated by $\sigma_1\cup\sigma_2$. In addition, we use the following notations:

1. For a function $f$, we denote by $f_x$ (resp. $f_{xx}$) the gradient or Jacobian (resp. the Hessian) of $f$ with respect to the variable $x$.
2. $D[0,T] = \{(t,s)\in[0,T]\times[0,T] \text{ such that } s\ge t\}$.
3. $S^n$: the set of $(n\times n)$ symmetric matrices.
4. $S^n_-$: the subset of all negative definite matrices of $S^n$.
5. $L^2(Z,\mathcal B(Z),\theta;\mathbb R^n)$: the space of functions $r: Z\to\mathbb R^n$ such that $r(\cdot)$ is $\mathcal B(Z)$-measurable, with
\[
\|r(\cdot)\|^2_{L^2(Z,\mathcal B(Z),\theta;\mathbb R^n)} = \int_Z |r(z)|^2\,\theta(dz) < \infty.
\]

6. $S^2_{\mathcal F}(t,T;\mathbb R^n)$: the space of $\mathbb R^n$-valued, $(\mathcal F_s)_{s\in[t,T]}$-adapted càdlàg processes $X(\cdot)$, with
\[
\|X(\cdot)\|^2_{S^2_{\mathcal F}(t,T;\mathbb R^n)} = \mathbb E\Big[\sup_{s\in[t,T]} |X(s)|^2\Big] < \infty.
\]

7. $L^2_{\mathcal F}(t,T;\mathbb R^n)$: the space of $\mathbb R^n$-valued, $(\mathcal F_s)_{s\in[t,T]}$-predictable processes $Y(\cdot)$, with
\[
\|Y(\cdot)\|^2_{L^2_{\mathcal F}(t,T;\mathbb R^n)} = \mathbb E\Big[\int_t^T |Y(s)|^2\,ds\Big] < \infty.
\]

8. $L^{\theta,2}_{\mathcal F}([t,T]\times Z;\mathbb R^n)$: the space of $\mathbb R^n$-valued, $(\mathcal F_s)_{s\in[t,T]}$-predictable processes $R(\cdot,\cdot)$, with
\[
\|R(\cdot,\cdot)\|^2_{L^{\theta,2}_{\mathcal F}([t,T]\times Z;\mathbb R^n)} = \mathbb E\Big[\int_t^T\!\!\int_Z |R(s,z)|^2\,\theta(dz)\,ds\Big] < \infty.
\]

9. $C^{1,2}([0,T]\times\mathbb R^n;\mathbb R)$: the space of functions $f: [0,T]\times\mathbb R^n\to\mathbb R$ such that $f$, $f_t$, $f_x$ and $f_{xx}$ are continuous.

Given a closed subset $U\subset\mathbb R^m$, let $b: [0,T]\times\mathbb R^n\times U\to\mathbb R^n$, $\sigma: [0,T]\times\mathbb R^n\times U\to\mathbb R^n$ and $c: [0,T]\times\mathbb R^n\times U\times Z\to\mathbb R^n$ be three deterministic measurable functions. We consider on the time interval $[0,T]$ the following controlled stochastic differential equation with random jumps (SDEJ),

\[
\begin{cases}
dX^{x_0,u(\cdot)}(s) = b\big(s,X^{x_0,u(\cdot)}(s),u(s)\big)\,ds + \sigma\big(s,X^{x_0,u(\cdot)}(s),u(s)\big)\,dW(s)\\[1mm]
\qquad\qquad + \displaystyle\int_Z c\big(s,X^{x_0,u(\cdot)}(s-),u(s-),z\big)\,\tilde N(ds,dz), \quad s\in[0,T],\\[1mm]
X^{x_0,u(\cdot)}(0) = x_0,
\end{cases}\tag{3}
\]
where $u: [0,T]\times\Omega\to U$ represents the control process, $X^{x_0,u(\cdot)}(\cdot)$ is the controlled state process and $x_0\in\mathbb R^n$ is regarded as the initial state.

As time evolves, we need to consider the following controlled stochastic differential equation starting from the situation $(t,x)\in[0,T]\times\mathbb R^n$,
\[
\begin{cases}
dX(s) = b(s,X(s),u(s))\,ds + \sigma(s,X(s),u(s))\,dW(s) + \displaystyle\int_Z c(s,X(s-),u(s-),z)\,\tilde N(ds,dz), \quad s\in[t,T],\\[1mm]
X(t) = x,
\end{cases}\tag{4}
\]
where $X(\cdot) = X^{t,x,u(\cdot)}(\cdot)$ denotes its solution. For any initial state $(t,x)\in[0,T]\times\mathbb R^n$, in order to measure the performance of a control process $u(\cdot)$, we introduce the following utility functional

"Z T # J (t, x; u (·)) = Et,x ν (t, s) f (s, u (s)) ds + ν (t, T ) h (X (T )) , (5) t where Et,x [·] is the conditional expectation given that the initial state of X (·) is x; f : [0,T ] × U → R and h : Rn → R are deterministic measurable functions; ν (·, ·): D [0,T ] → (0, ∞) is a continuous function which represents the discount function. Unlike [13], [14], [23] and [24] which are concerned with special forms of the discount function, here we consider the discount function in fairly general form. Moreover, this leads to the appearance of the initial time t in the local utility ν (t, s) f (s, u (s)) as well as in the terminal utility ν (t, T ) h (X (T )). As a conse- quence of this, the objective functional changes as time evolves: At time t ∈ [0,T ] hR T i we have, for example, the objective functional Et,x t ν (t, s) f (s, u (s)) ds which we want to maximize as a functional of u (·), but at a later time t + h we have hR T i the objective functional Et+h,X(t+h) t+h ν (t + h, s) f (s, u (s)) ds . This obviously leads to time-inconsistency, see e.g. [31], [13], [6], [36], [39] and references therein for more detailed discussion. We introduce the following assumptions. (H1) The maps b, σ and c are continuous and there exists a constant K > 0 such that for all (x, y) ∈ Rn × Rn and (t, u) ∈ [0,T ] × U, |b (t, x, u) − b (t, y, u)|2 + |σ (t, x, u) − σ (t, y, u)|2 Z + |c (t, x, u, z) − c (t, y, u, z)|2 θ (dz) ≤ K |x − y|2 Z and Z   |b (t, 0, u)|2 + |σ (t, 0, u)|2 + |c (t, 0, u, z)|2 θ (dz) ≤ K 1 + |u|2 . Z (H2) The maps f and h are quadratic growth on x and u uniformly in time, i.e. there exists a constant K > 0 such that for all x ∈ Rn and (t, u) ∈ [0,T ] × U,     |f (t, u)| ≤ K 1 + |u|2 and |h (x)| ≤ K 1 + |x|2 .

According to Lemma 2.1 in [25], under (H1)-(H2), for any initial pair $(t,x)\in[0,T]\times\mathbb R^n$ and any control $u(\cdot)\in S^2_{\mathcal F}(t,T;\mathbb R^m)$, the state equation (4) admits a unique solution $X(\cdot) = X^{t,x,u(\cdot)}(\cdot)\in S^2_{\mathcal F}(t,T;\mathbb R^n)$ and the utility functional $J(t,x;u(\cdot))$ is well-defined.

Definition 2.1. An admissible control $u(\cdot)$ over $[t,T]$ is a $U$-valued $(\mathcal F_s)_{s\in[t,T]}$-adapted càdlàg process such that
\[
\mathbb E\Big[\sup_{s\in[t,T]} |u(s)|^4\Big] < \infty.
\]
In the rest of this paper, we denote by $\mathcal U[t,T]$ the set of all admissible controls over $[t,T]$. Our stochastic optimal control problem can be stated as follows.

Problem (N). For any given initial pair $(t,x)\in[0,T]\times\mathbb R^n$, find a $\bar u^{t,x}(\cdot)\in\mathcal U[t,T]$ such that
\[
J\big(t,x;\bar u^{t,x}(\cdot)\big) = \sup_{u(\cdot)\in\mathcal U[t,T]} J(t,x;u(\cdot)).
\]

Remark 1. For a given $(t,x)\in[0,T]\times\mathbb R^n$, any $\bar u^{t,x}(\cdot)\in\mathcal U[t,T]$ satisfying the above is called a pre-commitment optimal control for Problem (N) at $(t,x)$.

3. Equilibrium control. Since Problem (N) is time-inconsistent in general, following [31], [14] and [6], we adopt a game theoretic formulation of the problem and aim to find the corresponding Nash equilibrium strategies. Specifically, we describe the non-cooperative game associated with Problem (N) as follows (see [6]):
1. We consider a game with one player at each point $t$ in time. This player represents the incarnation of the controller at time $t$ and is referred to as "player $t$".
2. The $t$-th player can control the system only at time $t$, by taking his/her strategy $u(t,\cdot):\Omega\to U$.
3. A control process $u(\cdot)\in\mathcal U[0,T]$ is then viewed as a complete description of the chosen strategies of all players in the game.
4. The reward to player $t$ is given by the functional $J\big(t,X^{x_0,u(\cdot)}(t);u(\cdot)\big)$.

According to the above game perspective, the concept of the Nash equilibrium control $\hat u(\cdot)$ can be intuitively described as follows:
(i) $\hat u(\cdot)\in\mathcal U[0,T]$.
(ii) Suppose that every player $s$, for $s>t$, will use the strategy $\hat u(s)$. Then the optimal choice for player $t$ is that he/she also uses the strategy $\hat u(t)$.

However, the problem with this "definition" is that the individual player $t$ does not really influence the outcome of the game at all: he/she only chooses the control at the single point $t$, and since this is a time set of Lebesgue measure zero, neither the controlled dynamics nor the functional $J\big(t,X^{x_0,u(\cdot)}(t);u(\cdot)\big)$ will be influenced. Therefore, to characterize the Nash equilibrium control, we need a more practical definition. In this paper, we follow Hu et al. [17], who provided the following definition of the so-called open-loop Nash equilibrium control.

Let $\hat u(\cdot)\in\mathcal U[0,T]$ be a given admissible control. For any $t\in[0,T]$, $v(\cdot)\in\mathcal U[0,T]$ and any $\varepsilon\in[0,T-t)$, define
\[
u^{t,\varepsilon,v(\cdot)}(s) = \begin{cases} v(s), & \text{for } s\in[t,t+\varepsilon),\\ \hat u(s), & \text{for } s\in[t+\varepsilon,T]. \end{cases}\tag{6}
\]
We have the following definition.

Definition 3.1 (Nash equilibrium control). Let $\hat u(\cdot)\in\mathcal U[0,T]$ be a given control and $\hat X^{x_0}(\cdot) = X^{x_0,\hat u(\cdot)}(\cdot)$ be the state process corresponding to $\hat u(\cdot)$. The control $\hat u(\cdot)$ is called an open-loop Nash equilibrium control for Problem (N) if
\[
\liminf_{\varepsilon\downarrow 0}\frac1\varepsilon\Big\{ J\big(t,\hat X^{x_0}(t);\hat u(\cdot)\big) - J\big(t,\hat X^{x_0}(t);u^{t,\varepsilon,v(\cdot)}(\cdot)\big) \Big\} \ge 0,\tag{7}
\]
where $u^{t,\varepsilon,v(\cdot)}(\cdot)$ is defined by (6), for any $t\in[0,T]$ and $v(\cdot)\in\mathcal U[0,T]$. In the rest of this paper, we sometimes simply call $\hat u(\cdot)$ an equilibrium control instead of an open-loop Nash equilibrium control when there is no ambiguity.

Remark 2. In the above definition, the perturbation of the strategy $\hat u(\cdot)$ on $[t,t+\varepsilon)$ does not change $\hat u(\cdot)$ on $[t+\varepsilon,T]$; this is not the case with feedback strategies as in [14], [6] and [36].

3.1. An equivalent time-consistent problem. In this subsection, we provide a surprising link between the time-inconsistent Problem (N) and a time-consistent control problem. Specifically, by using simple arguments, we prove that an admissible control $\hat u(\cdot)\in\mathcal U[0,T]$ is an equilibrium control for Problem (N) if and only if $\hat u(\cdot)$ is an equilibrium control for a standard time-consistent optimal control problem. For any $(t,x,u(\cdot))\in[0,T]\times\mathbb R^n\times\mathcal U[t,T]$, we define the modified objective functional
\[
\tilde J(t,x;u(\cdot)) = \mathbb E_{t,x}\Big[\int_t^T \frac{\nu(s,s)}{\nu(s,T)}\,f(s,u(s))\,ds + h(X(T))\Big],\tag{8}
\]
and we introduce the following stochastic optimal control problem.

Problem (C). For any given initial pair $(t,x)\in[0,T]\times\mathbb R^n$, find a $\hat u^{t,x}(\cdot)\in\mathcal U[t,T]$ such that
\[
\tilde J\big(t,x;\hat u^{t,x}(\cdot)\big) = \sup_{u(\cdot)\in\mathcal U[t,T]} \tilde J(t,x;u(\cdot)) = \tilde V(t,x).\tag{9}
\]

For a given $(t,x)\in[0,T]\times\mathbb R^n$, any admissible control satisfying (9) is called an optimal control for Problem (C) at $(t,x)$. The function $\tilde V(\cdot,\cdot)$ defined by (9) is called the value function of Problem (C).

Remark 3. Note that $\tilde J(t,x;u(\cdot))$ is in standard form (i.e. the local utility $\frac{\nu(s,s)}{\nu(s,T)}f(s,u(s))$ as well as the terminal utility $h(X(T))$ do not depend on $t$). Consequently, Problem (C) is a time-consistent stochastic control problem.

To establish the link between the time-inconsistent Problem (N) and the time-consistent Problem (C), we need to introduce the concept of the equilibrium control for Problem (C).

Definition 3.2. Let $\hat u(\cdot)\in\mathcal U[0,T]$ be a given control and $\hat X^{x_0}(\cdot)$ be the state process corresponding to $\hat u(\cdot)$. The control $\hat u(\cdot)$ is called an open-loop equilibrium control for Problem (C) if
\[
\liminf_{\varepsilon\downarrow 0}\frac1\varepsilon\Big\{ \tilde J\big(t,\hat X^{x_0}(t);\hat u(\cdot)\big) - \tilde J\big(t,\hat X^{x_0}(t);u^{t,\varepsilon,v(\cdot)}(\cdot)\big) \Big\} \ge 0,
\]
where $u^{t,\varepsilon,v(\cdot)}(\cdot)$ is defined by (6), for any $t\in[0,T]$ and $v(\cdot)\in\mathcal U[0,T]$.

The following theorem is the main result of this work; it ensures that Problem (C) and Problem (N) admit the same equilibrium controls.

Theorem 3.3. Let (H1)-(H2) hold. Then an admissible control $\hat u(\cdot)\in\mathcal U[0,T]$ is an equilibrium control for Problem (N) if and only if $\hat u(\cdot)$ is an equilibrium control for Problem (C).

To prove Theorem 3.3, we need some preliminary results, given in the following two lemmas.

Lemma 3.4. Let $\hat u(\cdot)\in\mathcal U[0,T]$ be an admissible control. Then $\hat u(\cdot)$ is an equilibrium control for Problem (N) if and only if, for any $t\in[0,T]$ and $v(\cdot)\in\mathcal U[0,T]$,
\[
\liminf_{\varepsilon\downarrow 0}\frac1\varepsilon\Big\{ \bar J\big(t,\hat X^{x_0}(t);\hat u(\cdot)\big) - \bar J\big(t,\hat X^{x_0}(t);u^{t,\varepsilon,v(\cdot)}(\cdot)\big) \Big\} \ge 0,\tag{10}
\]
where, for any $(t,x,u(\cdot))\in[0,T]\times\mathbb R^n\times\mathcal U[t,T]$,
\[
\bar J(t,x;u(\cdot)) = \frac{1}{\nu(t,T)}\,J(t,x;u(\cdot)) = \mathbb E_{t,x}\Big[\int_t^T \frac{\nu(t,s)}{\nu(t,T)}\,f(s,u(s))\,ds + h(X(T))\Big].
\]

Proof. Let $\hat u(\cdot)\in\mathcal U[0,T]$ be an admissible control. For any $t\in[0,T]$ and $v(\cdot)\in\mathcal U[0,T]$, we have
\[
\liminf_{\varepsilon\downarrow 0}\frac1\varepsilon\Big\{ \bar J\big(t,\hat X^{x_0}(t);\hat u(\cdot)\big) - \bar J\big(t,\hat X^{x_0}(t);u^{t,\varepsilon,v(\cdot)}(\cdot)\big) \Big\}
= \frac{1}{\nu(t,T)}\liminf_{\varepsilon\downarrow 0}\frac1\varepsilon\Big\{ J\big(t,\hat X^{x_0}(t);\hat u(\cdot)\big) - J\big(t,\hat X^{x_0}(t);u^{t,\varepsilon,v(\cdot)}(\cdot)\big) \Big\}.
\]
Thus, since $\nu(t,T)>0$, it is not difficult to see that $\hat u(\cdot)$ is an equilibrium control for Problem (N) if and only if (10) holds.

Lemma 3.5. Let (H1)-(H2) hold. Then the following equality holds:
\[
\liminf_{\varepsilon\downarrow 0}\frac1\varepsilon\Big\{ \bar J\big(t,\hat X^{x_0}(t);\hat u(\cdot)\big) - \bar J\big(t,\hat X^{x_0}(t);u^{t,\varepsilon,v(\cdot)}(\cdot)\big) \Big\}
= \liminf_{\varepsilon\downarrow 0}\frac1\varepsilon\Big\{ \tilde J\big(t,\hat X^{x_0}(t);\hat u(\cdot)\big) - \tilde J\big(t,\hat X^{x_0}(t);u^{t,\varepsilon,v(\cdot)}(\cdot)\big) \Big\},\tag{11}
\]
for any $\hat u(\cdot)\in\mathcal U[0,T]$, $v(\cdot)\in\mathcal U[0,T]$ and $t\in[0,T]$.

Proof. Let $\hat u(\cdot)\in\mathcal U[0,T]$ be an admissible strategy. Consider the perturbed strategy $u^{t,\varepsilon,v(\cdot)}(\cdot)$ defined by the spike variation (6), for some arbitrarily fixed $v(\cdot)\in\mathcal U[0,T]$, $t\in[0,T]$ and $\varepsilon\in[0,T-t)$. Let $X^\varepsilon(\cdot)$ be the solution of the state equation corresponding to $u^{t,\varepsilon,v(\cdot)}(\cdot)$. Consider the difference
\[
\begin{aligned}
&\Big\{ \bar J\big(t,\hat X^{x_0}(t);\hat u(\cdot)\big) - \bar J\big(t,\hat X^{x_0}(t);u^{t,\varepsilon,v(\cdot)}(\cdot)\big) \Big\}
- \Big\{ \tilde J\big(t,\hat X^{x_0}(t);\hat u(\cdot)\big) - \tilde J\big(t,\hat X^{x_0}(t);u^{t,\varepsilon,v(\cdot)}(\cdot)\big) \Big\}\\
&= \mathbb E_t\Big[\int_t^T \frac{\nu(t,s)}{\nu(t,T)}\big\{f(s,\hat u(s)) - f\big(s,u^{t,\varepsilon,v(\cdot)}(s)\big)\big\}\,ds + h\big(\hat X^{x_0}(T)\big) - h(X^\varepsilon(T))\Big]\\
&\quad - \mathbb E_t\Big[\int_t^T \frac{\nu(s,s)}{\nu(s,T)}\big\{f(s,\hat u(s)) - f\big(s,u^{t,\varepsilon,v(\cdot)}(s)\big)\big\}\,ds + h\big(\hat X^{x_0}(T)\big) - h(X^\varepsilon(T))\Big]\\
&= \mathbb E_t\Big[\int_t^T \Big(\frac{\nu(t,s)}{\nu(t,T)} - \frac{\nu(s,s)}{\nu(s,T)}\Big)\big\{f(s,\hat u(s)) - f\big(s,u^{t,\varepsilon,v(\cdot)}(s)\big)\big\}\,ds\Big]\\
&= \mathbb E_t\Big[\int_t^{t+\varepsilon} \Big(\frac{\nu(t,s)}{\nu(t,T)} - \frac{\nu(s,s)}{\nu(s,T)}\Big)\big(f(s,\hat u(s)) - f(s,v(s))\big)\,ds\Big],
\end{aligned}
\]
where $\mathbb E_t[\cdot] = \mathbb E[\,\cdot\,|\,\hat X^{x_0}(t)]$. Using the Cauchy-Schwarz inequality together with the quadratic growth condition, it follows that
\[
\begin{aligned}
&\Big|\Big\{ \bar J\big(t,\hat X^{x_0}(t);\hat u(\cdot)\big) - \bar J\big(t,\hat X^{x_0}(t);u^{t,\varepsilon,v(\cdot)}(\cdot)\big) \Big\}
- \Big\{ \tilde J\big(t,\hat X^{x_0}(t);\hat u(\cdot)\big) - \tilde J\big(t,\hat X^{x_0}(t);u^{t,\varepsilon,v(\cdot)}(\cdot)\big) \Big\}\Big|\\
&\le \Big(\int_t^{t+\varepsilon}\Big(\frac{\nu(t,s)}{\nu(t,T)} - \frac{\nu(s,s)}{\nu(s,T)}\Big)^2 ds\Big)^{\frac12}
\Big(\mathbb E_t\Big[\int_t^{t+\varepsilon} |f(s,v(s)) - f(s,\hat u(s))|^2\,ds\Big]\Big)^{\frac12}\\
&\le \Big(\int_t^{t+\varepsilon}\Big(\frac{\nu(t,s)}{\nu(t,T)} - \frac{\nu(s,s)}{\nu(s,T)}\Big)^2 ds\Big)^{\frac12}
\Big(\int_t^{t+\varepsilon} K\,\mathbb E_t\big[1+|v(s)|^4+|\hat u(s)|^4\big]\,ds\Big)^{\frac12},
\end{aligned}
\]
for some constant $K>0$. Dividing both sides by $\varepsilon$ and letting $\varepsilon$ vanish, we obtain
\[
\begin{aligned}
&\lim_{\varepsilon\downarrow 0}\frac1\varepsilon\Big|\Big\{ \bar J\big(t,\hat X^{x_0}(t);\hat u(\cdot)\big) - \bar J\big(t,\hat X^{x_0}(t);u^{t,\varepsilon,v(\cdot)}(\cdot)\big) \Big\}
- \Big\{ \tilde J\big(t,\hat X^{x_0}(t);\hat u(\cdot)\big) - \tilde J\big(t,\hat X^{x_0}(t);u^{t,\varepsilon,v(\cdot)}(\cdot)\big) \Big\}\Big|\\
&\le \lim_{\varepsilon\downarrow 0}\Big(\frac1\varepsilon\int_t^{t+\varepsilon}\Big(\frac{\nu(t,s)}{\nu(t,T)} - \frac{\nu(s,s)}{\nu(s,T)}\Big)^2 ds\Big)^{\frac12}
\Big(\frac1\varepsilon\int_t^{t+\varepsilon} K\,\mathbb E_t\big[1+|v(s)|^4+|\hat u(s)|^4\big]\,ds\Big)^{\frac12}\\
&= \Big|\frac{\nu(t,t)}{\nu(t,T)} - \frac{\nu(t,t)}{\nu(t,T)}\Big|\,\Big(K\,\mathbb E_t\big[1+|v(t)|^4+|\hat u(t)|^4\big]\Big)^{\frac12} = 0,
\end{aligned}
\]
where the last equality holds because $\frac{\nu(t,s)}{\nu(t,T)} - \frac{\nu(s,s)}{\nu(s,T)}$, $\mathbb E_t[|v(s)|^4]$ and $\mathbb E_t[|\hat u(s)|^4]$ are right-continuous functions of $s$. Accordingly, we have
\[
0 = \lim_{\varepsilon\downarrow 0}\frac1\varepsilon\Big[\Big\{ \bar J\big(t,\hat X^{x_0}(t);\hat u(\cdot)\big) - \bar J\big(t,\hat X^{x_0}(t);u^{t,\varepsilon,v(\cdot)}(\cdot)\big) \Big\}
- \Big\{ \tilde J\big(t,\hat X^{x_0}(t);\hat u(\cdot)\big) - \tilde J\big(t,\hat X^{x_0}(t);u^{t,\varepsilon,v(\cdot)}(\cdot)\big) \Big\}\Big],
\]
which leads to the desired result.

Remark 4. It seems that the equality (11) in Lemma 3.5 does not hold if the local utility is state dependent (i.e., if we replace $f(s,u(s))$ by $f(s,X(s),u(s))$ in the utility functional (5)).

Proof of Theorem 3.3. Let $\hat u(\cdot)\in\mathcal U[0,T]$ be an equilibrium control for Problem (N). Then by Lemma 3.4 and Lemma 3.5, for any $t\in[0,T]$ and $v(\cdot)\in\mathcal U[0,T]$, we have
\[
0 \le \liminf_{\varepsilon\downarrow 0}\frac1\varepsilon\Big\{ \bar J\big(t,\hat X^{x_0}(t);\hat u(\cdot)\big) - \bar J\big(t,\hat X^{x_0}(t);u^{t,\varepsilon,v(\cdot)}(\cdot)\big) \Big\}
= \liminf_{\varepsilon\downarrow 0}\frac1\varepsilon\Big\{ \tilde J\big(t,\hat X^{x_0}(t);\hat u(\cdot)\big) - \tilde J\big(t,\hat X^{x_0}(t);u^{t,\varepsilon,v(\cdot)}(\cdot)\big) \Big\}.
\]
Hence $\hat u(\cdot)$ is an equilibrium control for Problem (C).

Conversely, suppose that $\hat u(\cdot)\in\mathcal U[0,T]$ is an equilibrium control for Problem (C). Then for any $t\in[0,T]$ and $v(\cdot)\in\mathcal U[0,T]$, we have
\[
0 \le \liminf_{\varepsilon\downarrow 0}\frac1\varepsilon\Big\{ \tilde J\big(t,\hat X^{x_0}(t);\hat u(\cdot)\big) - \tilde J\big(t,\hat X^{x_0}(t);u^{t,\varepsilon,v(\cdot)}(\cdot)\big) \Big\}.
\]
By Lemma 3.5, we get
\[
0 \le \liminf_{\varepsilon\downarrow 0}\frac1\varepsilon\Big\{ \bar J\big(t,\hat X^{x_0}(t);\hat u(\cdot)\big) - \bar J\big(t,\hat X^{x_0}(t);u^{t,\varepsilon,v(\cdot)}(\cdot)\big) \Big\}.
\]
Therefore, it follows from Lemma 3.4 that $\hat u(\cdot)$ is an equilibrium control for Problem (N).
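As a quick sanity check on Theorem 3.3 (this verification is ours, not part of the original argument, but it is immediate from the definitions): for the exponential discount function $\nu(t,s) = e^{-\delta(s-t)}$ one has
\[
\frac{\nu(t,s)}{\nu(t,T)} = e^{\delta(T-s)} = \frac{\nu(s,s)}{\nu(s,T)},
\]
so $\bar J \equiv \tilde J$ and Problems (N) and (C) coincide up to the positive factor $\nu(t,T)$. In this classical case Problem (N) is itself time-consistent, and the equivalence of Theorem 3.3 reduces to the familiar statement that equilibrium and optimal controls are the same notion.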

As a consequence of Theorem 3.3, we have the following result.

Corollary 1. Let (H1)-(H2) hold. If $\hat u(\cdot)\in\mathcal U[0,T]$ is an optimal control for Problem (C) at $(0,x_0)$, i.e.

\[
\tilde J(0,x_0;\hat u(\cdot)) = \sup_{u(\cdot)\in\mathcal U[0,T]} \tilde J(0,x_0;u(\cdot)),
\]
then $\hat u(\cdot)$ is an equilibrium control for Problem (N).

Proof. Suppose that $\hat u(\cdot)\in\mathcal U[0,T]$ is an optimal control for Problem (C) at $(0,x_0)$. Then, by Bellman's principle of optimality, for any $t\in[0,T]$, $v(\cdot)\in\mathcal U[0,T]$ and $\varepsilon\in[0,T-t)$, we have
\[
\tilde J\big(t,\hat X^{x_0}(t);\hat u(\cdot)\big) = \sup_{u(\cdot)\in\mathcal U[t,T]} \tilde J\big(t,\hat X^{x_0}(t);u(\cdot)\big) \ge \tilde J\big(t,\hat X^{x_0}(t);u^{t,\varepsilon,v(\cdot)}(\cdot)\big).
\]

Accordingly, we have
\[
\tilde J\big(t,\hat X^{x_0}(t);\hat u(\cdot)\big) - \tilde J\big(t,\hat X^{x_0}(t);u^{t,\varepsilon,v(\cdot)}(\cdot)\big) \ge 0.
\]

Dividing both sides by $\varepsilon$ and letting $\varepsilon$ vanish, we get
\[
\liminf_{\varepsilon\downarrow 0}\frac1\varepsilon\Big\{ \tilde J\big(t,\hat X^{x_0}(t);\hat u(\cdot)\big) - \tilde J\big(t,\hat X^{x_0}(t);u^{t,\varepsilon,v(\cdot)}(\cdot)\big) \Big\} \ge 0,
\]
which means that $\hat u(\cdot)$ is an equilibrium control for Problem (C). Therefore, it follows from Theorem 3.3 that $\hat u(\cdot)$ is an equilibrium control for Problem (N).

Remark 5. Note that the corollary above is of great practical value: in order to obtain an equilibrium control for the time-inconsistent Problem (N), it suffices to find an optimal solution of the standard time-consistent Problem (C). This is quite different from [6] (Proposition 5.1, p. 41), in which, in order to formulate the equivalent time-consistent problem, one needs to know the equilibrium control policy $\hat u(\cdot,\cdot)$ beforehand.

4. Characterization of equilibriums. In this section, we present two independent results which characterize the equilibrium controls. The first one is a verification theorem associated with a standard HJB equation and the second one is a stochastic Pontryagin-type maximum principle.

4.1. Verification theorem. In this subsection, we present a stochastic verification theorem which provides a sufficient condition for equilibrium controls of Problem (N). Define the generalized Hamiltonian function as a map from $[0,T]\times\mathbb R^n\times U\times C^1(\mathbb R^n)\times\mathbb R^n\times S^n$ into $\mathbb R$ by
\[
\begin{aligned}
G(t,x,u,\Psi(\cdot),m,M) &= \langle m, b(t,x,u)\rangle + \frac12\,\mathrm{tr}\big[\sigma\sigma^\top(t,x,u)\,M\big] + \frac{\nu(t,t)}{\nu(t,T)}\,f(t,u)\\
&\quad + \int_Z \big\{\Psi(x+c(t,x,u,z)) - \Psi(x) - \langle\Psi_x(x), c(t,x,u,z)\rangle\big\}\,\theta(dz),
\end{aligned}
\]
and consider the Hamilton-Jacobi-Bellman equation associated with Problem (C) (see e.g. [27]):
\[
\begin{cases}
V_t(t,x) + \sup_{u\in U} G\big(t,x,u,V(t,\cdot),V_x(t,x),V_{xx}(t,x)\big) = 0, & (t,x)\in[0,T]\times\mathbb R^n,\\
V(T,x) = h(x), & x\in\mathbb R^n.
\end{cases}\tag{12}
\]

Theorem 4.1. Let (H1)-(H2) hold. Suppose that $V(\cdot,\cdot)\in C^{1,2}([0,T]\times\mathbb R^n;\mathbb R)$ is a classical solution to the HJB equation (12). If $\big(\hat X^{x_0}(\cdot),\hat u(\cdot)\big)$ is an admissible state-control pair such that
\[
\hat u(t) \in \arg\max G\big(t,\hat X^{x_0}(t),\cdot,V(t,\cdot),V_x\big(t,\hat X^{x_0}(t)\big),V_{xx}\big(t,\hat X^{x_0}(t)\big)\big), \quad \text{a.e. } t\in[0,T],\tag{13}
\]
then $\hat u(\cdot)$ is an equilibrium control for Problem (N).

Proof. Suppose that $V(\cdot,\cdot)$ is a classical solution to the HJB equation (12). By the classical verification theorem for optimal controls associated with Problem (C) (see e.g. [27], Theorem 4.1), if $\big(\hat X^{x_0}(\cdot),\hat u(\cdot)\big)$ is a state-control pair satisfying (13), then $V(0,x_0) = \tilde V(0,x_0)$ and $\hat u(\cdot)$ is an optimal control for Problem (C) at $(0,x_0)$. Thus, by Corollary 1, $\hat u(\cdot)$ is an equilibrium control for Problem (N).

Remark 6. This approach is direct and the derivation of the equilibrium control is not very complicated. Moreover, the equilibrium control $\hat u(\cdot)$ admits a closed-loop representation via (13).

4.2. Stochastic maximum principle. In this subsection, we present a necessary and sufficient condition for equilibrium controls. We derive this condition by using Theorem 3.3 and the second order Taylor expansion in the spike variation, in the same spirit as the proof of the stochastic Pontryagin maximum principle [32]. Moreover, we point out that our approach is different from the one followed by Hu et al. ([17], [18]); the stochastic maximum principle here does not involve a family of BSDEs (parameterized by the initial time $t$) as in [17].

Throughout this subsection, an admissible control $u(\cdot)$ is defined as a $U$-valued $(\mathcal F_s)_{s\in[t,T]}$-adapted càdlàg process such that
\[
\mathbb E\Big[\sup_{s\in[t,T]} |u(s)|^8\Big]^{\frac18} < \infty.
\]

The set of admissible controls $u(\cdot)$ over $[t,T]$ is denoted by $\mathcal U_0[t,T]$. Obviously, we have $\mathcal U_0[t,T]\subset\mathcal U[t,T]$. When $U=\mathbb R^m$, we write $S^8_{\mathcal F}(t,T;\mathbb R^m)$ for $\mathcal U_0[t,T]$.

Let $\hat u(\cdot)\in\mathcal U_0[0,T]$ be a fixed admissible control and $\hat X^{x_0}(\cdot)$ be the state process corresponding to $\hat u(\cdot)$. For some fixed arbitrary $u\in U$, we put, for $\varphi = b,\sigma$:
\[
\begin{cases}
\varphi(t) = \varphi\big(t,\hat X^{x_0}(t),\hat u(t)\big), \quad c(t,z) = c\big(t,\hat X^{x_0}(t),\hat u(t),z\big),\\
\varphi_x(t) = \varphi_x\big(t,\hat X^{x_0}(t),\hat u(t)\big), \quad \varphi_{xx}(t) = \varphi_{xx}\big(t,\hat X^{x_0}(t),\hat u(t)\big),\\
\delta\varphi(t;u) = \varphi\big(t,\hat X^{x_0}(t),u\big) - \varphi\big(t,\hat X^{x_0}(t),\hat u(t)\big), \quad \delta f(t;u) = f(t,u) - f(t,\hat u(t)),\\
c_x(t,z) = c_x\big(t,\hat X^{x_0}(t),\hat u(t),z\big), \quad c_{xx}(t,z) = c_{xx}\big(t,\hat X^{x_0}(t),\hat u(t),z\big),\\
\delta c(t,z;u) = c\big(t,\hat X^{x_0}(t),u,z\big) - c\big(t,\hat X^{x_0}(t),\hat u(t),z\big).
\end{cases}
\]
The following assumption, imposed in Tang and Li [32], will be in force throughout this subsection.

(H1*) The maps $b$, $\sigma$, $c$ and $h$ are twice continuously differentiable with respect to $x$. They and their derivatives in $x$ are continuous in $(x,u)$. The functions $|b(t,x,u)|$, $|\sigma(t,x,u)|$, $|h_x(x)|$ and
\[
\Big(\int_Z |c(t,x,u,z)|^{2k}\,\theta(dz)\Big)^{\frac{1}{2k}}, \quad k = 1,2,
\]
are bounded by $(1+|u|+|x|)$. The functions $|b_x(t,x,u)|$, $|b_{xx}(t,x,u)|$, $|\sigma_x(t,x,u)|$, $|\sigma_{xx}(t,x,u)|$, $|h_{xx}(x)|$ and
\[
\int_Z |c_x(t,x,u,z)|^{2k}\,\theta(dz),\ k=1,2, \qquad \int_Z |c_{xx}(t,x,u,z)|^2\,\theta(dz)
\]
are bounded.

Define the Hamiltonian as a map from $[0,T]\times\mathbb R^n\times U\times\mathbb R^n\times\mathbb R^n\times L^2(Z,\mathcal B(Z),\theta;\mathbb R^n)$ into $\mathbb R$ by
\[
H(t,x,u,p,q,r(\cdot)) = \langle b(t,x,u), p\rangle + \langle\sigma(t,x,u), q\rangle + \int_Z \langle c(t,x,u,z), r(z)\rangle\,\theta(dz) + \frac{\nu(t,t)}{\nu(t,T)}\,f(t,u),
\]
and let us introduce the adjoint equations involved in the stochastic maximum principle which characterizes the equilibrium controls.

The first order adjoint equation associated with the state-control pair $\big(\hat X^{x_0}(\cdot),\hat u(\cdot)\big)$ is the following linear BSDE satisfied by the processes $(p(\cdot),q(\cdot),r(\cdot,\cdot))$,

\[
\begin{cases}
dp(t) = -\Big\{ b_x(t)^\top p(t) + \sigma_x(t)^\top q(t) + \displaystyle\int_Z c_x(t,z)^\top r(t,z)\,\theta(dz) \Big\}\,dt\\[1mm]
\qquad\qquad + q(t)\,dW(t) + \displaystyle\int_Z r(t,z)\,\tilde N(dt,dz), \quad t\in[0,T],\\[1mm]
p(T) = h_x\big(\hat X^{x_0}(T)\big),
\end{cases}\tag{14}
\]
and the second order adjoint equation associated with $\big(\hat X^{x_0}(\cdot),\hat u(\cdot)\big)$ is the following BSDE satisfied by the matrix-valued processes $(P(\cdot),\Lambda(\cdot),\Gamma(\cdot,\cdot))$,

\[
\begin{cases}
dP(t) = -\Big\{ b_x(t)^\top P(t) + P(t)b_x(t) + \sigma_x(t)^\top P(t)\sigma_x(t) + \sigma_x(t)^\top\Lambda(t) + \Lambda(t)\sigma_x(t)\\[1mm]
\qquad\qquad + \displaystyle\int_Z c_x(t,z)^\top\big(\Gamma(t,z)+P(t)\big)c_x(t,z)\,\theta(dz)\\[1mm]
\qquad\qquad + \displaystyle\int_Z c_x(t,z)^\top\Gamma(t,z)\,\theta(dz) + \displaystyle\int_Z \Gamma(t,z)c_x(t,z)\,\theta(dz)\\[1mm]
\qquad\qquad + H_{xx}\big(t,\hat X^{x_0}(t),\hat u(t),p(t),q(t),r(t,\cdot)\big) \Big\}\,dt\\[1mm]
\qquad\qquad + \Lambda(t)\,dW(t) + \displaystyle\int_Z \Gamma(t,z)\,\tilde N(dt,dz), \quad t\in[0,T],\\[1mm]
P(T) = h_{xx}\big(\hat X^{x_0}(T)\big).
\end{cases}\tag{15}
\]

Under (H1*)-(H2) the above BSDEs are uniquely solvable, with $(p(\cdot),q(\cdot),r(\cdot,\cdot))\in S^2_{\mathcal F}(0,T;\mathbb R^n)\times L^2_{\mathcal F}(0,T;\mathbb R^n)\times L^{\theta,2}_{\mathcal F}([0,T]\times Z;\mathbb R^n)$ and $(P(\cdot),\Lambda(\cdot),\Gamma(\cdot,\cdot))\in S^2_{\mathcal F}(0,T;S^n)\times L^2_{\mathcal F}(0,T;S^n)\times L^{\theta,2}_{\mathcal F}([0,T]\times Z;S^n)$, respectively; see e.g. [32].

Proposition 1. Let (H1*)-(H2) hold. Then the following equality holds:
\[
\begin{aligned}
&\tilde J\big(t,\hat X^{x_0}(t);\hat u(\cdot)\big) - \tilde J\big(t,\hat X^{x_0}(t);u^{t,\varepsilon,v(\cdot)}(\cdot)\big)\\
&= -\mathbb E_t\Big[\int_t^{t+\varepsilon}\Big\{\delta H(s;v(s)) + \frac12\,\delta\sigma(s;v(s))^\top P(s)\,\delta\sigma(s;v(s))\Big\}\,ds\Big]\\
&\quad - \frac12\,\mathbb E_t\Big[\int_t^{t+\varepsilon}\!\!\int_Z \delta c(s,z;v(s))^\top\big(\Gamma(s,z)+P(s)\big)\,\delta c(s,z;v(s))\,\theta(dz)\,ds\Big] + o(\varepsilon),
\end{aligned}\tag{16}
\]
where
\[
\delta H(s;v(s)) = \delta b(s;v(s))^\top p(s) + \delta\sigma(s;v(s))^\top q(s) + \int_Z \delta c(s,z;v(s))^\top r(s,z)\,\theta(dz) + \frac{\nu(s,s)}{\nu(s,T)}\,\delta f(s;v(s)).
\]

Proof. The proof can be adapted from [32] (Step 2, pp. 1463-1467). We omit it.

Differently from Theorem 4.1, the following theorem provides a complete characterization of the equilibrium controls.

Theorem 4.2 (Stochastic maximum principle). Let (H1*)-(H2) hold. Given an admissible control $\hat u(\cdot)\in\mathcal U_0[0,T]$, let $(p(\cdot),q(\cdot),r(\cdot,\cdot))$ and $(P(\cdot),\Lambda(\cdot),\Gamma(\cdot,\cdot))$ be the unique solutions to the BSDEs (14) and (15), respectively. Then $\hat u(\cdot)$ is an equilibrium control for Problem (N) if and only if the following condition holds:

\[
\begin{aligned}
0 &\ge H\big(t,\hat X^{x_0}(t),u,p(t),q(t),r(t,\cdot)\big) - H\big(t,\hat X^{x_0}(t),\hat u(t),p(t),q(t),r(t,\cdot)\big)\\
&\quad + \frac12\int_Z \delta c(t,z;u)^\top\big(\Gamma(t,z)+P(t)\big)\,\delta c(t,z;u)\,\theta(dz)\\
&\quad + \frac12\,\delta\sigma(t;u)^\top P(t)\,\delta\sigma(t;u), \qquad \mathbb P\text{-a.s.},\ \forall u\in U,\ \text{a.e. } t\in[0,T].
\end{aligned}\tag{17}
\]

Proof. Suppose that $\hat u(\cdot)\in\mathcal U_0[0,T]$ is an equilibrium control for Problem (N). Then, by Lemma 3.4 and Lemma 3.5 together with Proposition 1, we have
\[
\begin{aligned}
0 &\le \liminf_{\varepsilon\downarrow 0}\frac1\varepsilon\Big\{ \bar J\big(t,\hat X^{x_0}(t);\hat u(\cdot)\big) - \bar J\big(t,\hat X^{x_0}(t);u^{t,\varepsilon,v(\cdot)}(\cdot)\big) \Big\}
= \liminf_{\varepsilon\downarrow 0}\frac1\varepsilon\Big\{ \tilde J\big(t,\hat X^{x_0}(t);\hat u(\cdot)\big) - \tilde J\big(t,\hat X^{x_0}(t);u^{t,\varepsilon,v(\cdot)}(\cdot)\big) \Big\}\\
&= \liminf_{\varepsilon\downarrow 0}\frac1\varepsilon\Big\{ \mathbb E_t\Big[\int_t^{t+\varepsilon}\Big(-\delta H(s;v(s)) - \frac12\,\delta\sigma(s;v(s))^\top P(s)\,\delta\sigma(s;v(s))\Big)\,ds\Big]\\
&\qquad\qquad - \frac12\,\mathbb E_t\Big[\int_t^{t+\varepsilon}\!\!\int_Z \delta c(s,z;v(s))^\top\big(\Gamma(s,z)+P(s)\big)\,\delta c(s,z;v(s))\,\theta(dz)\,ds\Big] \Big\},
\end{aligned}
\]
for any $t\in[0,T]$ and any $v(\cdot)\in\mathcal U_0[0,T]$. Applying Lemma 3.5 in [19], we get
\[
0 \le -\delta H(t;v(t)) - \frac12\,\delta\sigma(t;v(t))^\top P(t)\,\delta\sigma(t;v(t)) - \frac12\int_Z \delta c(t,z;v(t))^\top\big(\Gamma(t,z)+P(t)\big)\,\delta c(t,z;v(t))\,\theta(dz), \quad \mathbb P\text{-a.s., a.e. } t\in[0,T].
\]
Thus, by setting $v(t)\equiv u$ for an arbitrary $u\in U$, we get (17).

Conversely, suppose that $\hat u(\cdot)$ is an admissible control for which the variational inequality (17) holds. Then for any $t\in[0,T]$, $v(\cdot)\in\mathcal U_0[0,T]$ and $\varepsilon\in[0,T-t)$, we have
\[
0 \le \mathbb E_t\Big[\int_t^{t+\varepsilon}\Big(-\delta H(s;v(s)) - \frac12\,\delta\sigma(s;v(s))^\top P(s)\,\delta\sigma(s;v(s))\Big)\,ds\Big] - \frac12\,\mathbb E_t\Big[\int_t^{t+\varepsilon}\!\!\int_Z \delta c(s,z;v(s))^\top\big(\Gamma(s,z)+P(s)\big)\,\delta c(s,z;v(s))\,\theta(dz)\,ds\Big].
\]
Dividing both sides by $\varepsilon$ and letting $\varepsilon$ vanish,
\[
\begin{aligned}
0 &\le \liminf_{\varepsilon\downarrow 0}\frac1\varepsilon\Big\{ \mathbb E_t\Big[\int_t^{t+\varepsilon}\Big(-\delta H(s;v(s)) - \frac12\,\delta\sigma(s;v(s))^\top P(s)\,\delta\sigma(s;v(s))\Big)\,ds\Big]\\
&\qquad\qquad - \frac12\,\mathbb E_t\Big[\int_t^{t+\varepsilon}\!\!\int_Z \delta c(s,z;v(s))^\top\big(\Gamma(s,z)+P(s)\big)\,\delta c(s,z;v(s))\,\theta(dz)\,ds\Big] \Big\}\\
&= \liminf_{\varepsilon\downarrow 0}\frac1\varepsilon\Big\{ \tilde J\big(t,\hat X^{x_0}(t);\hat u(\cdot)\big) - \tilde J\big(t,\hat X^{x_0}(t);u^{t,\varepsilon,v(\cdot)}(\cdot)\big) \Big\}
= \liminf_{\varepsilon\downarrow 0}\frac1\varepsilon\Big\{ \bar J\big(t,\hat X^{x_0}(t);\hat u(\cdot)\big) - \bar J\big(t,\hat X^{x_0}(t);u^{t,\varepsilon,v(\cdot)}(\cdot)\big) \Big\}.
\end{aligned}
\]
Hence, by Lemma 3.4, $\hat u(\cdot)$ is an equilibrium control for Problem (N). This completes the proof.

Remark 7. Define an $\mathcal H$-function associated with $\big(\hat u(\cdot),\hat X^{x_0}(\cdot),p(\cdot),q(\cdot),r(\cdot,\cdot),P(\cdot),\Gamma(\cdot,\cdot)\big)$ as follows: for $(t,x,u)\in[0,T]\times\mathbb R^n\times U$,
\[
\begin{aligned}
\mathcal H(t,x,u) &= H(t,x,u,p(t),q(t),r(t,\cdot))\\
&\quad + \frac12\big(\sigma(t,x,u)-\sigma\big(t,\hat X^{x_0}(t),\hat u(t)\big)\big)^\top P(t)\,\big(\sigma(t,x,u)-\sigma\big(t,\hat X^{x_0}(t),\hat u(t)\big)\big)\\
&\quad + \frac12\int_Z \big(c(t,x,u,z)-c\big(t,\hat X^{x_0}(t),\hat u(t),z\big)\big)^\top\big(\Gamma(t,z)+P(t)\big)\big(c(t,x,u,z)-c\big(t,\hat X^{x_0}(t),\hat u(t),z\big)\big)\,\theta(dz).
\end{aligned}\tag{18}
\]
Then easy manipulations show that the variational inequality (17) is equivalent to

\[
\mathcal H\big(t,\hat X^{x_0}(t),\hat u(t)\big) = \max_{u\in U}\,\mathcal H\big(t,\hat X^{x_0}(t),u\big), \quad \mathbb P\text{-a.s., a.e. } t\in[0,T].\tag{19}
\]

Remark 8. Comparing Theorem 4.2 with Theorem 4.1, we find the following facts:
(i) The advantage: the stochastic maximum principle in Theorem 4.2 provides a necessary and sufficient equilibrium condition, while the verification argument in Theorem 4.1 provides only a sufficient condition for characterizing the equilibrium controls.

(ii) The disadvantage: Assumption (H1*) is rather strong, which makes the stochastic maximum principle hard to apply in practice.

4.2.1. Sufficient conditions for optimality of equilibrium controls. Corollary 1 ensures that, under Assumptions (H1)-(H2), if $\hat u(\cdot)$ is an optimal solution of Problem (C) then $\hat u(\cdot)$ is an equilibrium control for Problem (N). One naturally asks whether an equilibrium control of Problem (N) is optimal for Problem (C). In this paragraph, we prove that, provided some concavity assumptions are satisfied, any Nash equilibrium control for Problem (N) is indeed optimal for Problem (C). Let us first introduce two additional assumptions.

(H3) The control domain $U$ is convex. The map $h$ is concave with respect to $x$ and the Hamiltonian function $H$ is concave with respect to $(x,u)$.

(H4) The maps $b$, $\sigma$, $c$ and $f$ are continuously differentiable with respect to $u$.

From Assumption (H4), we can see that the Hamiltonian $H$ as well as the $\mathcal H$-function associated with the 7-tuple $\big(\hat u(\cdot),\hat X^{x_0}(\cdot),p(\cdot),q(\cdot),r(\cdot,\cdot),P(\cdot),\Gamma(\cdot,\cdot)\big)$ are also continuously differentiable with respect to $u$. Moreover, it is not difficult to verify that
\[
\mathcal H_u\big(t,\hat X^{x_0}(t),\hat u(t)\big) = H_u\big(t,\hat X^{x_0}(t),\hat u(t),p(t),q(t),r(t,\cdot)\big), \quad \mathbb P\text{-a.s., a.e. } t\in[0,T].
\]
The following theorem can be seen as the converse of Corollary 1 under Assumptions (H1*)-(H4); its proof follows an argument adapted from the proof of Theorem 4.2 in [30].

Theorem 4.3. Let (H1*)-(H4) hold. If $\hat u(\cdot)\in\mathcal U_0[0,T]$ is an equilibrium control for Problem (N), then $\hat u(\cdot)$ is an optimal control for Problem (C) at $(0,x_0)$.

Proof. Suppose that $\hat u(\cdot)$ is an equilibrium control for Problem (N) and let $(p(\cdot),q(\cdot),r(\cdot,\cdot))$ be the unique solution to the BSDE (14). By virtue of Remark 7, we have
\[
\mathcal H\big(t,\hat X^{x_0}(t),\hat u(t)\big) = \max_{u\in U}\,\mathcal H\big(t,\hat X^{x_0}(t),u\big), \quad \mathbb P\text{-a.s., a.e. } t\in[0,T].
\]

Since $U$ is convex and the $\mathcal H$-function is continuously differentiable with respect to $u$, the first order Euler condition gives
\[
0 \le (\hat u(t)-u)^\top \mathcal H_u\big(t,\hat X^{x_0}(t),\hat u(t)\big) = (\hat u(t)-u)^\top H_u\big(t,\hat X^{x_0}(t),\hat u(t),p(t),q(t),r(t,\cdot)\big), \quad \mathbb P\text{-a.s.},\ \forall u\in U,\ \text{a.e. } t\in[0,T].\tag{20}
\]

Now, for any control process $v(\cdot)\in\mathcal U_0[0,T]$ and the corresponding controlled state process $X^{x_0,v(\cdot)}(\cdot)$, consider the difference

\[
\tilde J(0,x_0;\hat u(\cdot)) - \tilde J(0,x_0;v(\cdot)) = \mathbb E\Big[\int_0^T \frac{\nu(s,s)}{\nu(s,T)}\{f(s,\hat u(s))-f(s,v(s))\}\,ds + h\big(\hat X^{x_0}(T)\big) - h\big(X^{x_0,v(\cdot)}(T)\big)\Big].
\]

By the concavity of h (·), we have

\[
\mathbb E\big[h\big(\hat X^{x_0}(T)\big) - h\big(X^{x_0,v(\cdot)}(T)\big)\big] \ge \mathbb E\Big[\big(\hat X^{x_0}(T) - X^{x_0,v(\cdot)}(T)\big)^\top h_x\big(\hat X^{x_0}(T)\big)\Big].
\]

Accordingly, by the terminal condition in the BSDE (14) we obtain that

"Z T ˜ ˜ ν (s, s) J (0, x0;u ˆ (·)) − J (0, x0; v (·)) ≥ E {f (s, uˆ (s)) − f (s, v (s))} ds 0 ν (s, T )  >  + Xˆ x0 (T ) − Xx0,v(·) (T ) p (T ) . (21)

 > From Itˆo’slemma, applied to s 7→ Xˆ x0 (s) − Xx0,v(·) (s) p (s), it follows that

 >   x x ,v(·)  E Xˆ 0 (T ) − X 0 (T ) p (T )

"Z T  >   x   x ,v(·)  = E b s, Xˆ 0 (s) , uˆ (s) − b s, X 0 (T ) , v (s) p (s) 0     > + σ s, Xˆ x0 (s) , uˆ (s) − σ s, Xx0,v(·) (s) , v (s) q (s) (22) Z     > + c s, Xˆ x0 (s) , uˆ (s) , z − c s, Xx0,v(·) (s) , v (s) , z r (s, z) θ (dz) Z  >    ˆ x0 x0,v(·) ˆ x0 − X (s) − X (s) Hx s, X (s) , uˆ (s) , p (s) , q (s) , r (s, ·) ds .

On the other hand, by the definition of the Hamiltonian $H$, we have
\[
\begin{aligned}
&\mathbb E\Big[\int_0^T \frac{\nu(s,s)}{\nu(s,T)}\{f(s,\hat u(s))-f(s,v(s))\}\,ds\Big]\\
&= \mathbb E\Big[\int_0^T \Big\{ H\big(s,\hat X^{x_0}(s),\hat u(s),p(s),q(s),r(s,\cdot)\big) - H\big(s,X^{x_0,v(\cdot)}(s),v(s),p(s),q(s),r(s,\cdot)\big)\\
&\qquad - \big(b\big(s,\hat X^{x_0}(s),\hat u(s)\big) - b\big(s,X^{x_0,v(\cdot)}(s),v(s)\big)\big)^\top p(s)\\
&\qquad - \big(\sigma\big(s,\hat X^{x_0}(s),\hat u(s)\big) - \sigma\big(s,X^{x_0,v(\cdot)}(s),v(s)\big)\big)^\top q(s)\\
&\qquad - \int_Z \big(c\big(s,\hat X^{x_0}(s),\hat u(s),z\big) - c\big(s,X^{x_0,v(\cdot)}(s),v(s),z\big)\big)^\top r(s,z)\,\theta(dz) \Big\}\,ds\Big].
\end{aligned}\tag{23}
\]
Plugging (22) and (23) into (21), we obtain

\[
\begin{aligned}
&\tilde J(0,x_0;\hat u(\cdot)) - \tilde J(0,x_0;v(\cdot))\\
&\ge \mathbb E\Big[\int_0^T \Big\{ H\big(s,\hat X^{x_0}(s),\hat u(s),p(s),q(s),r(s,\cdot)\big) - H\big(s,X^{x_0,v(\cdot)}(s),v(s),p(s),q(s),r(s,\cdot)\big)\\
&\qquad - \big(\hat X^{x_0}(s) - X^{x_0,v(\cdot)}(s)\big)^\top H_x\big(s,\hat X^{x_0}(s),\hat u(s),p(s),q(s),r(s,\cdot)\big) \Big\}\,ds\Big].
\end{aligned}\tag{24}
\]

By the concavity of the Hamiltonian $H$ in $(x,u)$, we have
\[
\begin{aligned}
&H\big(s,\hat X^{x_0}(s),\hat u(s),p(s),q(s),r(s,\cdot)\big) - H\big(s,X^{x_0,v(\cdot)}(s),v(s),p(s),q(s),r(s,\cdot)\big)\\
&\ge \big(\hat X^{x_0}(s) - X^{x_0,v(\cdot)}(s)\big)^\top H_x\big(s,\hat X^{x_0}(s),\hat u(s),p(s),q(s),r(s,\cdot)\big)\\
&\quad + (\hat u(s)-v(s))^\top H_u\big(s,\hat X^{x_0}(s),\hat u(s),p(s),q(s),r(s,\cdot)\big).
\end{aligned}\tag{25}
\]
Combining (24)-(25) together with (20) (applied with $u = v(s)$), it follows that

\[
\tilde J(0,x_0;\hat u(\cdot)) - \tilde J(0,x_0;v(\cdot)) \ge \mathbb E\Big[\int_0^T (\hat u(s)-v(s))^\top H_u\big(s,\hat X^{x_0}(s),\hat u(s),p(s),q(s),r(s,\cdot)\big)\,ds\Big] \ge 0,
\]
which means that $\hat u(\cdot)$ is an optimal control for Problem (C) at $(0,x_0)$.

The following result is comparable with Theorem 3.3.

Theorem 4.4. Let (H1*)-(H4) hold. Then an admissible control $\hat u(\cdot)\in\mathcal U_0[0,T]$ is an equilibrium control for Problem (N) if and only if $\hat u(\cdot)$ is an optimal control for Problem (C) at $(0,x_0)$.

Proof. The proof can be directly obtained by combining Corollary 1 and Theorem 4.3.

Remark 9. (i) Note that Theorem 4.4 provides a stronger link between Problem (N) and Problem (C) than the one ensured by Theorem 3.3. Moreover, an interesting consequence of Theorem 4.4 is that, under Assumptions (H1*)-(H4), if Problem (C) admits a unique optimal control $\hat u(\cdot)$, then $\hat u(\cdot)$ is the unique equilibrium control for Problem (N). (ii) The disadvantage of Theorem 4.4 is that Assumptions (H1*)-(H4) are rather strong, which limits the scope of problems to which the theorem applies.

5. Some applications. In this section, we discuss two special cases of time-inconsistent stochastic control problems to illustrate our results.

5.1. Generalized Merton portfolio problem. We consider a consumption and investment problem associated with a jump-diffusion market and a non-exponentially discounted utility. We apply the verification argument in Theorem 4.1 to derive the equilibrium consumption and investment strategies in state feedback form. Note that, in the absence of Poisson random jumps, this problem was considered by Yong [36], in which the equilibrium is, however, defined within the class of closed-loop controls (see [36], Definition 4.1).

Suppose that there is a financial market in which two securities are traded continuously. One of them is a bond, with price $S_0(s)$ at time $s\in[0,T]$ governed by
\[
\frac{dS_0(s)}{S_0(s)} = r(s)\,ds, \qquad S_0(0) = p_0 > 0,
\]
where $r: [0,T]\to(0,+\infty)$ is a deterministic function which represents the risk-free rate. The other asset is called the risky stock, whose price process $S_1(\cdot)$ satisfies the following stochastic differential equation
\[
\frac{dS_1(s)}{S_1(s)} = \mu(s)\,ds + \sigma(s)\,dW(s) + \int_Z \phi(s,z)\,\tilde N(ds,dz), \qquad S_1(0) = p_1 > 0,
\]
where $\mu: [0,T]\to(0,\infty)$, $\sigma: [0,T]\to(0,\infty)$ and $\phi: [0,T]\times Z\to\mathbb R$ are deterministic measurable functions; $\mu(\cdot)$, $\sigma(\cdot)$ and $\phi(\cdot,\cdot)$ represent the appreciation rate, the volatility and the jump coefficient of the risky stock, respectively. We assume that $r(\cdot)$, $\mu(\cdot)$, $\sigma(\cdot)$ and $\phi(\cdot,z)$ (for every $z\in Z$) are continuous functions of $s\in[0,T]$, such that
\[
\mu(s) > r(s), \quad \phi(s,z) \ge -1, \quad \text{for a.e. } s\in[0,T],\ \theta\text{-a.e. } z\in Z,
\]
and $\sup_{s\in[0,T]}\int_Z |\phi(s,z)|^2\,\theta(dz) < +\infty$. We also require a nondegeneracy condition as follows:
\[
\sigma(s)^2 + \int_Z \phi(s,z)^2\,\theta(dz) \ge \epsilon, \quad \text{for a.e. } s\in[0,T],
\]
for some $\epsilon > 0$.

Starting from an initial capital $x_0 > 0$ at time $0$, during the time horizon $[0,T]$, the decision maker is allowed to invest dynamically in the stock as well as in the bond, and to consume. A consumption-investment strategy is described by a two-dimensional stochastic process $u(\cdot) = (c(\cdot),\pi(\cdot))$, where $c(s)\ge 0$ represents the consumption rate at time $s\in[0,T]$ and $\pi(s)\in[0,1]$ represents the fraction of the agent's wealth allocated to the risky stock at time $s\in[0,T]$. Clearly, the dollar amount invested in the bond at time $s$ is $X^{x_0,c(\cdot),\pi(\cdot)}(s)(1-\pi(s))$, where $X^{x_0,c(\cdot),\pi(\cdot)}(\cdot)$ (which is required to be non-negative) is the wealth process associated with the strategy $(c(\cdot),\pi(\cdot))$ and the initial capital $x_0$. The evolution of

$X^{x_0,c(\cdot),\pi(\cdot)}(\cdot)$ can be described as
\[
\begin{cases}
dX^{x_0,c(\cdot),\pi(\cdot)}(s) = X^{x_0,c(\cdot),\pi(\cdot)}(s)\,(1-\pi(s))\,\dfrac{dS_0(s)}{S_0(s)} - c(s)\,ds + X^{x_0,c(\cdot),\pi(\cdot)}(s-)\,\pi(s-)\,\dfrac{dS_1(s)}{S_1(s)}, \quad s\in[0,T],\\[1mm]
X^{x_0,c(\cdot),\pi(\cdot)}(0) = x_0.
\end{cases}\tag{26}
\]
Accordingly, the wealth process solves the following SDEJ

\[
\begin{cases}
dX^{x_0,c(\cdot),\pi(\cdot)}(s) = \big\{X^{x_0,c(\cdot),\pi(\cdot)}(s)\big(r(s)+\pi(s)(\mu(s)-r(s))\big) - c(s)\big\}\,ds\\[1mm]
\qquad\qquad + X^{x_0,c(\cdot),\pi(\cdot)}(s)\,\pi(s)\,\sigma(s)\,dW(s) + \displaystyle\int_Z X^{x_0,c(\cdot),\pi(\cdot)}(s-)\,\pi(s-)\,\phi(s,z)\,\tilde N(ds,dz), \quad s\in[0,T],\\[1mm]
X^{x_0,c(\cdot),\pi(\cdot)}(0) = x_0.
\end{cases}\tag{27}
\]
As time evolves, we consider the SDEJ parameterized by $(t,x)$ and satisfied by $X(\cdot) = X^{t,x,c(\cdot),\pi(\cdot)}(\cdot)$,
\[
\begin{cases}
dX(s) = \{X(s)(r(s)+\pi(s)(\mu(s)-r(s))) - c(s)\}\,ds + X(s)\,\pi(s)\,\sigma(s)\,dW(s)\\[1mm]
\qquad\qquad + \displaystyle\int_Z X(s-)\,\pi(s-)\,\phi(s,z)\,\tilde N(ds,dz), \quad s\in[t,T],\\[1mm]
X(t) = x.
\end{cases}\tag{28}
\]
A trading strategy $u(\cdot) = (c(\cdot),\pi(\cdot))$ is said to be admissible over $[t,T]$ if it is an $\mathbb R^2$-valued $(\mathcal F_s)_{s\in[t,T]}$-adapted càdlàg process such that:
(i) $\mathbb E\big[\sup_{s\in[t,T]} |c(s)|^4\big] < \infty$;
(ii) $c(s)\ge 0$, $\pi(s)\in[0,1]$, $\mathbb P$-a.s., a.e. $s\in[t,T]$.
We denote by $\mathcal C[t,T]\times\Pi[t,T]$ the set of all admissible strategies over $[t,T]$.

In order to evaluate the performance of a consumption-investment strategy, the decision maker derives utility from intertemporal consumption and final wealth. Specifically, the investment-consumption optimization problem is:
\[
\text{Maximize } J(t,x;c(\cdot),\pi(\cdot)) = \mathbb E_{t,x}\Big[\int_t^T \nu(t,s)\,\frac{c(s)^\beta}{\beta}\,ds + \nu(t,T)\,\frac{X(T)^\beta}{\beta}\Big],\tag{29}
\]
over the set of admissible controls $\mathcal C[t,T]\times\Pi[t,T]$, where $\nu(\cdot,\cdot): D[0,T]\to(0,+\infty)$ is a fairly general discount function and $\beta\in(0,1)$. In particular, the equivalent time-consistent optimal control problem can be defined as follows:
\[
\text{Maximize } \tilde J(t,x;c(\cdot),\pi(\cdot)) = \mathbb E_{t,x}\Big[\int_t^T \frac{\nu(s,s)}{\nu(s,T)}\,\frac{c(s)^\beta}{\beta}\,ds + \frac{X(T)^\beta}{\beta}\Big],\tag{30}
\]
over $(c(\cdot),\pi(\cdot))\in\mathcal C[t,T]\times\Pi[t,T]$.

5.1.1. Equilibrium solution. In what follows, we apply the verification argument in Theorem 4.1 to derive the equilibrium strategy. Specifically, we use the standard dynamic programming approach to compute an optimal solution for the equivalent time-consistent problem (30). This optimal solution coincides with an equilibrium strategy for the time-inconsistent consumption-investment problem (29).
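To make the controlled dynamics (27) concrete, here is a minimal simulation sketch (ours, not part of the paper). It assumes constant coefficients, a proportional consumption rule, and a compound-Poisson Lévy measure $\theta$ concentrated on two equally weighted jump sizes; all numerical values are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed, illustrative market data (constant coefficients).
r, mu, sigma = 0.03, 0.08, 0.25
lam = 0.5                            # total mass theta(Z): jump intensity
jump_sizes = np.array([-0.2, 0.3])   # atoms of theta; phi(s, z) = z here
T, n_steps, x0 = 1.0, 2000, 1.0
dt = T / n_steps

def simulate_wealth(c_prop, pi):
    """Euler scheme for the wealth SDEJ (27) under the consumption rule
    c(s) = c_prop * X(s) and a constant stock fraction pi."""
    X = np.empty(n_steps + 1)
    X[0] = x0
    # Compensator of the jump sum over one step: dt * int_Z phi(z) theta(dz).
    compensator = dt * lam * jump_sizes.mean()
    for i in range(n_steps):
        dW = rng.normal(0.0, np.sqrt(dt))
        n_jumps = rng.poisson(lam * dt)
        jump_sum = rng.choice(jump_sizes, size=n_jumps).sum()
        dN_tilde = jump_sum - compensator        # compensated jump increment
        drift = X[i] * (r + pi * (mu - r)) - c_prop * X[i]
        X[i + 1] = X[i] + drift * dt + X[i] * pi * (sigma * dW + dN_tilde)
    return X

print(simulate_wealth(c_prop=0.1, pi=0.5)[-1])
```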

Before providing the precise statement of the main result of this subsection, let us introduce the function $F: [0,T]\times[0,1]\to\mathbb R$ given by
\[
F(t,\pi) = \beta(\mu(t)-r(t))\pi + \frac12(\sigma(t)\pi)^2\beta(\beta-1) + \int_Z \big\{(1+\phi(t,z)\pi)^\beta - 1 - \beta\phi(t,z)\pi\big\}\,\theta(dz),\tag{31}
\]
and note that
\[
F_\pi(t,\pi) = \beta(\mu(t)-r(t)) + \sigma(t)^2\beta(\beta-1)\pi + \int_Z \phi(t,z)\,\beta\big\{(1+\pi\phi(t,z))^{\beta-1} - 1\big\}\,\theta(dz),\tag{32}
\]
\[
F_{\pi\pi}(t,\pi) = (\beta-1)\beta\Big(\sigma(t)^2 + \int_Z \phi(t,z)^2(1+\pi\phi(t,z))^{\beta-2}\,\theta(dz)\Big),
\]
where $F_\pi(\cdot,\cdot)$ and $F_{\pi\pi}(\cdot,\cdot)$ denote, respectively, the first and second order derivatives of $F(\cdot,\cdot)$ with respect to $\pi$. Since $\beta\in(0,1)$, it is not difficult to see that the function $F(t,\pi)$ is strictly concave with respect to $\pi$.

Theorem 5.1. The time-inconsistent portfolio problem (29) has an equilibrium consumption-investment strategy that can be represented by

\[
\hat c(t) = \Big(\frac{\nu(t,T)}{\nu(t,t)}\,\xi(t)\Big)^{\frac{1}{\beta-1}}\hat X^{x_0}(t), \quad t\in[0,T],\tag{33}
\]
and
\[
\hat\pi(t) = \begin{cases} 1, & \text{if } F_\pi(t,1)\ge 0,\\ \pi^*(t), & \text{if } F_\pi(t,1)<0, \end{cases} \quad t\in[0,T],\tag{34}
\]
where $\pi^*(t)$ denotes the unique solution of $F_\pi(t,\pi)=0$ in $(0,1)$, $\xi(\cdot)$ is given by
\[
\xi(t) = e^{\int_t^T \{\beta r(\tau)+F(\tau,\hat\pi(\tau))\}\,d\tau}\Big(1 + \int_t^T \Big(\frac{\nu(\tau,T)}{\nu(\tau,\tau)}\Big)^{\frac{1}{\beta-1}} e^{-\int_\tau^T \frac{\beta r(l)+F(l,\hat\pi(l))}{1-\beta}\,dl}\,d\tau\Big)^{1-\beta},\tag{35}
\]
and the equilibrium wealth process $\hat X^{x_0}(\cdot)$ is given by

\[
\begin{aligned}
\hat X^{x_0}(t) = x_0\exp\Big(&\int_0^t \Big\{ r(\tau) + \hat\pi(\tau)(\mu(\tau)-r(\tau)) - \Big(\frac{\nu(\tau,T)}{\nu(\tau,\tau)}\,\xi(\tau)\Big)^{\frac{1}{\beta-1}} - \frac12(\hat\pi(\tau)\sigma(\tau))^2 \Big\}\,d\tau\\
&+ \int_0^t \hat\pi(\tau)\sigma(\tau)\,dW(\tau) + \int_0^t\!\!\int_Z \{\ln(1+\hat\pi(\tau)\phi(\tau,z)) - \hat\pi(\tau)\phi(\tau,z)\}\,\theta(dz)\,d\tau\\
&+ \int_0^t\!\!\int_Z \ln(1+\hat\pi(\tau)\phi(\tau,z))\,\tilde N(d\tau,dz)\Big), \quad \text{for } t\in[0,T].
\end{aligned}\tag{36}
\]

Proof. Assume for the time being that the conditions of Theorem 4.1 hold. The generalized Hamiltonian function $G$ associated with this problem is
\[
\begin{aligned}
G(t,x,c,\pi,\Psi(\cdot),m,M) &= m\big(r(t)x + (\mu(t)-r(t))x\pi - c\big) + \frac12(\sigma(t)x\pi)^2 M\\
&\quad + \int_Z \{\Psi(x+x\pi\phi(t,z)) - \Psi(x) - \Psi_x(x)\,x\pi\phi(t,z)\}\,\theta(dz) + \frac{\nu(t,t)}{\nu(t,T)}\frac{c^\beta}{\beta}.
\end{aligned}
\]

Accordingly, the HJB equation takes the form
\[
\begin{cases}
0 = V_t(t,x) + V_x(t,x)\,r(t)x + \sup_{c\ge0,\,\pi\in[0,1]}\Big\{ \dfrac{\nu(t,t)}{\nu(t,T)}\dfrac{c^\beta}{\beta} - V_x(t,x)\,c + V_x(t,x)(\mu(t)-r(t))x\pi + \dfrac12(\sigma(t)x\pi)^2 V_{xx}(t,x)\\
\qquad\qquad + \displaystyle\int_Z \{V(t,x+x\pi\phi(t,z)) - V(t,x) - V_x(t,x)\,x\pi\phi(t,z)\}\,\theta(dz) \Big\},\\[1mm]
V(T,x) = \dfrac{x^\beta}{\beta}.
\end{cases}\tag{37}
\]
Because of the terminal condition, we consider an ansatz of the form
\[
V(t,x) = \xi(t)\,\frac{x^\beta}{\beta},
\]
for some deterministic function $\xi(\cdot)\in C^1([0,T],\mathbb R)$ such that $\xi(T)=1$. Accordingly, the partial derivatives are
\[
V_t(t,x) = \frac{d\xi(t)}{dt}\frac{x^\beta}{\beta}, \quad V_x(t,x) = \xi(t)\,x^{\beta-1}, \quad V_{xx}(t,x) = \xi(t)(\beta-1)\,x^{\beta-2}.
\]
Substituting $V(t,x)$ and the above derivatives into (37) leads to

\[
\begin{aligned}
0 = \frac{d\xi(t)}{dt}\frac{x^\beta}{\beta} + \xi(t)\,r(t)\,x^\beta + \sup_{c\ge0,\,\pi\in[0,1]}\Big\{& \frac{\nu(t,t)}{\nu(t,T)}\frac{c^\beta}{\beta} - \xi(t)\,x^{\beta-1}c + \xi(t)(\mu(t)-r(t))\pi x^\beta + \frac12(\sigma(t)\pi)^2\xi(t)(\beta-1)x^\beta\\
&+ \frac1\beta\,\xi(t)\,x^\beta\int_Z\big\{(1+\phi(t,z)\pi)^\beta - 1 - \beta\phi(t,z)\pi\big\}\,\theta(dz)\Big\}.
\end{aligned}\tag{38}
\]
Note that the optimization problem in (38) breaks down into two independent optimization problems, so its solution can be obtained sequentially: we first optimize (38) with respect to $c$, and then with respect to $\pi$. Since $\beta\in(0,1)$, the quantity to be maximized in (38) is strictly concave with respect to the control variable $c$, and the first order condition provides a maximizer $\hat c(t,x)$, which is given by

\[
\hat c(t,x) = \Big(\frac{\nu(t,T)}{\nu(t,t)}\,\xi(t)\Big)^{\frac{1}{\beta-1}} x;
\]
plugging this into Equation (38) and factoring out the term $\dfrac{x^\beta}{\beta}$, we obtain

 1   β−1  dξ (t) ν (t, T ) β  0 = + βξ (t) r (t) + (1 − β) ξ (t) β−1  dt ν (t, t) + ξ (t) sup F (t, π), (39)   π∈[0,1]  ξ (T ) = 1, where F (t, π) is as introduced in (31). Taking into account the constraint π ∈ [0, 1] and the strict concavity of F (t, π), we conclude that the maximization problem in (39) has a unique solutionπ ˆ (t). Moreover, from the definition of the function F (t, π), and using simple arguments such as the intermediate value theorem it is possible to check that: 562 ISHAK ALIA

(i) if Fπ (t, 1) < 0 and Fπ (t, 0) = β (µ (t) − r (t)) > 0, then there exists a unique ∗ ∗ interior point π (t) ∈ (0, 1) such that Fπ t, π (t) = 0 and, consequently, πˆ (t) = π (t)∗ ; (ii) if Fπ (t, 1) ≥ 0 and Fπ (t, 0) = β (µ (t) − r (t)) > 0, then F (t, π) attains its maximum at the boundary point 1, and consequentlyπ ˆ (t) = 1.

Now, we turn our attention to the ODE in (39). Replacing $\pi$ by $\hat\pi(t)$ in Equation (39) leads to

\[
\begin{cases}
\dfrac{d\xi(t)}{dt} + \xi(t)\{\beta r(t) + F(t,\hat\pi(t))\} + (1-\beta)\Big(\dfrac{\nu(t,T)}{\nu(t,t)}\Big)^{\frac{1}{\beta-1}}\xi(t)^{\frac{\beta}{\beta-1}} = 0,\\[1mm]
\xi(T) = 1.
\end{cases}
\]
This is a Bernoulli equation. To solve it, let

\[
\xi(t) = y(t)^{1-\beta}, \quad \text{for } t\in[0,T].
\]
We find that $y(\cdot)$ should satisfy the following linear ODE,

\[
\begin{cases}
\dfrac{dy(t)}{dt} + \dfrac{\beta r(t) + F(t,\hat\pi(t))}{1-\beta}\,y(t) + \Big(\dfrac{\nu(t,T)}{\nu(t,t)}\Big)^{\frac{1}{\beta-1}} = 0, \quad \text{for } t\in[0,T],\\[1mm]
y(T) = 1.
\end{cases}
\]
A variation of constants formula yields

\[
y(t) = e^{\int_t^T \frac{\beta r(\tau)+F(\tau,\hat\pi(\tau))}{1-\beta}\,d\tau}\Big(1 + \int_t^T \Big(\frac{\nu(\tau,T)}{\nu(\tau,\tau)}\Big)^{\frac{1}{\beta-1}} e^{-\int_\tau^T \frac{\beta r(l)+F(l,\hat\pi(l))}{1-\beta}\,dl}\,d\tau\Big), \quad \text{for } t\in[0,T],
\]
which leads to (35). According to the above derivation and Theorem 4.1, the equilibrium consumption-investment solution is given by (33)-(34). Moreover, the corresponding wealth process solves the SDEJ

  1   ν(s,T )  β−1  dXˆ x0 (s) = Xˆ x0 (s) r (s) +π ˆ (s)(µ (s) − r (s)) − ξ (s) ds  ν(s,s)   Z + Xˆ x0 (s)π ˆ (s) σ (s) dW (s) + Xˆ x0 (s−)π ˆ (s) φ (s, z) N˜ (ds, dz),  Z  for s ∈ [0,T ],   x0  Xˆ (0) = x0. By argument of Itˆo’sformula and a change of variable (see [27], Example 1.15, pp. 7-8), we obtain that Xˆ (·) is given by (36) .

Remark 10. Time-consistent solutions for the non-exponential discounting Merton portfolio problem have been well explored using different methods and different concepts of Nash equilibrium. See [6], [23], [15] for the extended HJB equations method, [14], [2] for the duality method, and [36] for a multi-person differential game approach. Although the existing literature has provided mathematically elegant results, the focus has been on financial models without jumps. As far as we know, our paper is the first to find the equilibrium consumption-investment solution under a jump-diffusion model.

Corollary 2. In the case of the classical form of the discount function (i.e. $\nu(t,s) = e^{-\delta(s-t)}$ for every $(t,s)\in D[0,T]$), the equilibrium consumption and investment strategies are given by

\[
\hat c(t) = \big(e^{-\delta(T-t)}\xi_0(t)\big)^{\frac{1}{\beta-1}}\hat X^{x_0}(t),
\]
and
\[
\hat\pi(t) = \begin{cases} 1, & \text{if } F_\pi(t,1)\ge0,\\ \pi^*(t), & \text{if } F_\pi(t,1)<0, \end{cases}
\]
where
\[
\xi_0(t) = e^{\int_t^T\{\beta r(\tau)+F(\tau,\hat\pi(\tau))\}\,d\tau}\Big(1 + \int_t^T \big(e^{-\delta(T-\tau)}\big)^{\frac{1}{\beta-1}} e^{-\int_\tau^T \frac{\beta r(l)+F(l,\hat\pi(l))}{1-\beta}\,dl}\,d\tau\Big)^{1-\beta},
\]
and
\[
\begin{aligned}
\hat X^{x_0}(t) = x_0\exp\Big(&\int_0^t \Big\{ r(\tau) + \hat\pi(\tau)(\mu(\tau)-r(\tau)) - \big(e^{-\delta(T-\tau)}\xi_0(\tau)\big)^{\frac{1}{\beta-1}} - \frac12(\hat\pi(\tau)\sigma(\tau))^2 \Big\}\,d\tau\\
&+ \int_0^t \hat\pi(\tau)\sigma(\tau)\,dW(\tau) + \int_0^t\!\!\int_Z \{\ln(1+\hat\pi(\tau)\phi(\tau,z)) - \hat\pi(\tau)\phi(\tau,z)\}\,\theta(dz)\,d\tau\\
&+ \int_0^t\!\!\int_Z \ln(1+\hat\pi(\tau)\phi(\tau,z))\,\tilde N(d\tau,dz)\Big), \quad \text{for } t\in[0,T].
\end{aligned}
\]

Remark 11. The equilibrium consumption-investment strategy $(\hat c(\cdot),\hat\pi(\cdot))$ presented in Corollary 2 coincides with the optimal solution of the classical Merton portfolio problem under a jump-diffusion model (see e.g. [3], the case without regime switching). This confirms the well-known fact that the equilibrium strategy for an exponential discount function is nothing but the optimal strategy. A relevant remark is that the investment strategy $\hat\pi(t)$ is independent of the discount function, and it is the same for the non-exponential discount function.

The case without jumps. If the modeling framework is without jumps, then the function $F(t,\pi)$ and its first and second order derivatives reduce to
\[
g(t,\pi) = \beta(\mu(t)-r(t))\pi + \frac12(\sigma(t)\pi)^2\beta(\beta-1),
\]
\[
g_\pi(t,\pi) = \beta(\mu(t)-r(t)) + \sigma(t)^2\pi\beta(\beta-1), \qquad g_{\pi\pi}(t,\pi) = \sigma(t)^2\beta(\beta-1),
\]
respectively. Moreover, the equilibrium consumption-investment strategy given by (33)-(34) reduces to

\[
\hat c(t) = \Big(\frac{\nu(t,T)}{\nu(t,t)}\,\tilde\xi(t)\Big)^{\frac{1}{\beta-1}}\hat X^{x_0}(t),
\]
and
\[
\hat\pi(t) = \begin{cases} \dfrac{\mu(t)-r(t)}{\sigma(t)^2(1-\beta)}, & \text{if } \mu(t)-r(t) < \sigma(t)^2(1-\beta),\\[1mm] 1, & \text{if } \mu(t)-r(t) \ge \sigma(t)^2(1-\beta), \end{cases}
\]
where

\[
\tilde\xi(t) = e^{\int_t^T\{\beta r(\tau)+g(\tau,\hat\pi(\tau))\}\,d\tau}\Big(1 + \int_t^T \Big(\frac{\nu(\tau,T)}{\nu(\tau,\tau)}\Big)^{\frac{1}{\beta-1}} e^{-\int_\tau^T \frac{\beta r(l)+g(l,\hat\pi(l))}{1-\beta}\,dl}\,d\tau\Big)^{1-\beta}, \quad \text{for } t\in[0,T],
\]
and

\[
\hat X^{x_0}(t) = x_0\exp\Big(\int_0^t \Big\{ r(\tau) + \hat\pi(\tau)(\mu(\tau)-r(\tau)) - \Big(\frac{\nu(\tau,T)}{\nu(\tau,\tau)}\,\tilde\xi(\tau)\Big)^{\frac{1}{\beta-1}} - \frac12(\hat\pi(\tau)\sigma(\tau))^2 \Big\}\,d\tau + \int_0^t \hat\pi(\tau)\sigma(\tau)\,dW(\tau)\Big), \quad \text{for } t\in[0,T].
\]
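For completeness, the interior case of $\hat\pi$ above is just the stationary point of the concave function $g(t,\cdot)$ (a one-line computation that the text leaves implicit):
\[
g_\pi(t,\pi) = \beta(\mu(t)-r(t)) + \sigma(t)^2\pi\beta(\beta-1) = 0 \iff \pi = \frac{\mu(t)-r(t)}{\sigma(t)^2(1-\beta)},
\]
which is the classical Merton fraction; the constraint $\pi\in[0,1]$ then produces the two cases displayed above.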

Remark 12. In Yong [36], the equilibrium is defined for a class of closed-loop controls. Therein, the closed-loop equilibrium strategy is derived via the equilibrium HJB equation, which leads to the linear feedback forms
\[
\pi^*_{CL}(t) = \frac{\mu(t)-r(t)}{\sigma(t)^2(1-\beta)}\,X^*_{CL}(t) \quad\text{and}\quad c^*_{CL}(t) = \Big(\frac{\varphi^{CL}(t)}{\nu(t,t)}\Big)^{\frac{1}{\beta-1}} X^*_{CL}(t),
\]
where $\varphi^{CL}(\cdot)$ is uniquely determined by an integral equation (whose unique solvability is established). We can easily show that the function $\tilde\xi(\cdot)$ in our equilibrium consumption solution does not satisfy the integral equation in [36]. Thus our solution is different from the one obtained by Yong [36]; this is due to the difference between the two definitions of equilibrium (open-loop and closed-loop).

Remark 13. In the case of the classical form of the discount function, the equilibrium consumption-investment strategy becomes

\[
\hat c(t) = \big(e^{-\delta(T-t)}\tilde\xi_0(t)\big)^{\frac{1}{\beta-1}}\hat X^{x_0}(t),
\]
and
\[
\hat\pi(t) = \begin{cases} \dfrac{\mu(t)-r(t)}{\sigma(t)^2(1-\beta)}, & \text{if } \mu(t)-r(t) < \sigma(t)^2(1-\beta),\\[1mm] 1, & \text{if } \mu(t)-r(t) \ge \sigma(t)^2(1-\beta), \end{cases}
\]
where

\[
\tilde\xi_0(t) = e^{\int_t^T\{\beta r(\tau)+g(\tau,\hat\pi(\tau))\}\,d\tau}\Big(1 + \int_t^T \big(e^{-\delta(T-\tau)}\big)^{\frac{1}{\beta-1}} e^{-\int_\tau^T \frac{\beta r(l)+g(l,\hat\pi(l))}{1-\beta}\,dl}\,d\tau\Big)^{1-\beta}, \quad \text{for } t\in[0,T],
\]
and

\[
\hat X^{x_0}(t) = x_0\exp\Big(\int_0^t \Big\{ r(\tau) + \hat\pi(\tau)(\mu(\tau)-r(\tau)) - \big(e^{-\delta(T-\tau)}\tilde\xi_0(\tau)\big)^{\frac{1}{\beta-1}} - \frac12(\hat\pi(\tau)\sigma(\tau))^2 \Big\}\,d\tau + \int_0^t \hat\pi(\tau)\sigma(\tau)\,dW(\tau)\Big), \quad \text{for } t\in[0,T].
\]
This essentially coincides with the optimal solution of the classical Merton portfolio problem (see e.g. [23], the case with constant discount rate).

5.2. Time-inconsistent stochastic linear-quadratic problem. Now, we consider the case where the state equation is linear in both the state and the control, and the cost functional is quadratic in the control. In particular, we apply the stochastic maximum principle in Theorem 4.2 in order to derive the equilibrium solution. The results in this subsection are comparable with some of the results in [34], [36] and [37]. The dynamics over $[0,T]$ is given by the following linear controlled SDEJ,

\[
\begin{cases}
dX^{x_0,u(\cdot)}(s) = \left\{ A(s) X^{x_0,u(\cdot)}(s) + B(s) u(s) \right\} ds + \left\{ C(s) X^{x_0,u(\cdot)}(s) + D(s) u(s) \right\} dW(s) \\
\qquad\qquad\qquad\; + \displaystyle\int_Z \left\{ E(s,z) X^{x_0,u(\cdot)}(s-) + F(s,z) u(s-) \right\} \tilde{N}(ds,dz), \quad s \in [0,T], \\
X(0) = x_0,
\end{cases} \tag{40}
\]
where $A : [0,T] \to \mathbb{R}^{n\times n}$, $B : [0,T] \to \mathbb{R}^{n\times m}$, $C : [0,T] \to \mathbb{R}^{n\times n}$, $D : [0,T] \to \mathbb{R}^{n\times m}$, $E : [0,T] \times Z \to \mathbb{R}^{n\times n}$ and $F : [0,T] \times Z \to \mathbb{R}^{n\times m}$ are continuous functions such that
\[
\sup_{t\in[0,T]} \int_Z |E(t,z)|^2\, \theta(dz) < \infty \quad \text{and} \quad \sup_{t\in[0,T]} \int_Z |F(t,z)|^2\, \theta(dz) < \infty.
\]
The gain functional is given by

\[
J(t,x;u(\cdot)) = \frac{1}{2} E_{t,x}\left[ \int_t^T \nu(t,s) \left\langle R(s) u(s), u(s) \right\rangle ds + \nu(t,T) \left\langle G X(T), X(T) \right\rangle \right], \tag{41}
\]

where $R(\cdot) \in C\left([0,T]; S^m_-\right)$, $G \in S^n_-$ and $X(\cdot) = X^{t,x,u(\cdot)}(\cdot)$ solves the SDEJ
\[
\begin{cases}
dX(s) = \left\{ A(s) X(s) + B(s) u(s) \right\} ds + \left\{ C(s) X(s) + D(s) u(s) \right\} dW(s) \\
\qquad\quad + \displaystyle\int_Z \left\{ E(s,z) X(s-) + F(s,z) u(s-) \right\} \tilde{N}(ds,dz), \quad s \in [t,T], \\
X(t) = x.
\end{cases} \tag{42}
\]
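Before turning to the equilibrium characterization, it may help to see how a trajectory of (42) can be generated numerically. The sketch below simulates one Euler path in the scalar case $n = m = 1$, under the simplifying assumption (made only for this illustration) that the Lévy measure is a single atom, $\theta(dz) = \lambda\, \delta_{z_0}(dz)$, so that the compensated jump integral contributes $(E X + F u)(dN - \lambda\, ds)$ over each step. All coefficient values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical scalar coefficients (n = m = 1), constant in time.
A, B, C, D = 0.1, 1.0, 0.2, 0.5
E0, F0 = 0.3, 0.1        # E(z0), F(z0) at the single jump mark z0
lam = 2.0                # jump intensity: theta(dz) = lam * delta_{z0}(dz)
T, N = 1.0, 1000
h = T / N

def simulate(x, u_func):
    """One Euler path of (42); the compensator of N shows up as -lam*h."""
    X = x
    for k in range(N):
        u = u_func(k * h, X)
        dW = np.sqrt(h) * rng.standard_normal()
        dN = rng.poisson(lam * h)                    # number of jumps in the step
        X = (X + (A * X + B * u) * h
               + (C * X + D * u) * dW
               + (E0 * X + F0 * u) * (dN - lam * h)) # compensated jump part
    return X

# Example: a linear feedback control, of the form appearing in (48) below.
print(simulate(1.0, lambda t, x: -0.5 * x))
```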

The control domain is $U = \mathbb{R}^m$ and the set of admissible controls is $\mathcal{U}_0[t,T] = S^8_{\mathcal{F}}\left(t,T;\mathbb{R}^m\right)$.

Remark 14. In the present case, the equivalent time-consistent optimal control problem is: for any $(t,x) \in [0,T] \times \mathbb{R}^n$, maximize
\[
\tilde{J}(t,x;u(\cdot)) = \frac{1}{2} E_{t,x}\left[ \int_t^T \frac{\nu(s,s)}{\nu(s,T)} \left\langle R(s) u(s), u(s) \right\rangle ds + \left\langle G X(T), X(T) \right\rangle \right], \tag{43}
\]

over $u(\cdot) \in S^8_{\mathcal{F}}\left(t,T;\mathbb{R}^m\right)$.

5.2.1. Equilibrium solution. In what follows, we apply the stochastic maximum principle in Theorem 4.2 to derive the Nash equilibrium control. For brevity, we suppress the time argument and write $A$, $B$, $C$, $D$, $E(z)$ and $F(z)$ for $A(t)$, $B(t)$, $C(t)$, $D(t)$, $E(t,z)$ and $F(t,z)$, whenever no confusion arises.

In the context of this problem, the Hamiltonian function $H$ takes the following form
\[
H(t,x,u,p,q,r(\cdot)) = \langle p, Ax + Bu \rangle + \langle q, Cx + Du \rangle + \int_Z \langle r(z), E(z)x + F(z)u \rangle\, \theta(dz) + \frac{1}{2} \frac{\nu(t,t)}{\nu(t,T)} \langle R(t)u, u \rangle,
\]
for $(t,x,u,p,q,r(\cdot)) \in [0,T] \times \mathbb{R}^n \times \mathbb{R}^m \times \mathbb{R}^n \times \mathbb{R}^n \times L^2\left(Z, \mathcal{B}(Z), \theta; \mathbb{R}^n\right)$.

Accordingly, the first and second order adjoint equations associated with $\left(\hat{X}^{x_0}(\cdot), \hat{u}(\cdot)\right)$ are, respectively, given by
\[
\begin{cases}
dp(t) = -\left\{ A^\top p(t) + C^\top q(t) + \displaystyle\int_Z E(z)^\top r(t,z)\, \theta(dz) \right\} dt + q(t)\, dW(t) + \displaystyle\int_Z r(t,z)\, \tilde{N}(dt,dz), \quad t \in [0,T], \\
p(T) = G \hat{X}^{x_0}(T),
\end{cases} \tag{44}
\]
and
\[
\begin{cases}
dP(t) = -\Big\{ A^\top P(t) + P(t) A + C^\top P(t) C + \Lambda(t) C + C^\top \Lambda(t) + \displaystyle\int_Z E(z)^\top \left(\Gamma(t,z) + P(t)\right) E(z)\, \theta(dz) \\
\qquad\quad + \displaystyle\int_Z \Gamma(t,z) E(z)\, \theta(dz) + \displaystyle\int_Z E(z)^\top \Gamma(t,z)\, \theta(dz) \Big\}\, dt + \Lambda(t)\, dW(t) + \displaystyle\int_Z \Gamma(t,z)\, \tilde{N}(dt,dz), \quad t \in [0,T], \\
P(T) = G.
\end{cases} \tag{45}
\]
Noting that the terminal condition of Equation (45) is deterministic, it is natural to look for a deterministic solution. Indeed, $P(\cdot)$, $\Lambda(\cdot)$ and $\Gamma(\cdot,\cdot)$ are explicitly given by
\[
\begin{cases}
P(t) = E\left[ \Phi(t,T)^\top G\, \Phi(t,T) \mid \mathcal{F}_t \right], & \forall t \in [0,T], \\
\Lambda(t) = 0, & \text{a.e. } t \in [0,T], \\
\Gamma(t,z) = 0, & \text{a.e. } (t,z) \in [0,T] \times Z,
\end{cases}
\]
where, for each $t \in [0,T]$, $\Phi(t,\cdot)$ is the unique solution to the linear SDE
\[
\begin{cases}
d\Phi(t,r) = A(r) \Phi(t,r)\, dr + C(r) \Phi(t,r)\, dW(r) + \displaystyle\int_Z E(r,z) \Phi(t,r-)\, \tilde{N}(dr,dz), \quad r \in [t,T], \\
\Phi(t,t) = I_n,
\end{cases}
\]
where $I_n$ denotes the $(n \times n)$ identity matrix. Consequently, the $\mathcal{H}$-function in (18) takes the form: for $(t,x,u) \in [0,T] \times \mathbb{R}^n \times \mathbb{R}^m$,
\[
\begin{aligned}
\mathcal{H}(t,x,u) = {} & \langle p(t), Ax + Bu \rangle + \langle q(t), Cx + Du \rangle + \int_Z \langle r(t,z), E(z)x + F(z)u \rangle\, \theta(dz) + \frac{1}{2} \frac{\nu(t,t)}{\nu(t,T)} \langle R(t)u, u \rangle \\
& + \frac{1}{2} \left( C\left(x - \hat{X}^{x_0}(t)\right) + D\left(u - \hat{u}(t)\right) \right)^\top P(t) \left( C\left(x - \hat{X}^{x_0}(t)\right) + D\left(u - \hat{u}(t)\right) \right) \\
& + \frac{1}{2} \int_Z \left( E(z)\left(x - \hat{X}^{x_0}(t)\right) + F(z)\left(u - \hat{u}(t)\right) \right)^\top P(t) \left( E(z)\left(x - \hat{X}^{x_0}(t)\right) + F(z)\left(u - \hat{u}(t)\right) \right) \theta(dz).
\end{aligned}
\]
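Because $\Lambda \equiv 0$ and $\Gamma \equiv 0$, equation (45) reduces to a deterministic backward matrix ODE, $\frac{dP}{dt}(t) = -\big(A^\top P(t) + P(t)A + C^\top P(t) C + \int_Z E(z)^\top P(t) E(z)\, \theta(dz)\big)$ with $P(T) = G$, which is straightforward to integrate numerically. The following sketch does so by explicit Euler, for hypothetical constant matrices and a single jump atom of intensity $\lambda$ (both assumptions of the illustration, not of the paper).

```python
import numpy as np

# Hypothetical constant data; the integral over Z collapses to lam * E0.T @ P @ E0.
n = 2
A   = np.array([[0.0, 1.0], [0.0, 0.1]])
C   = 0.2 * np.eye(n)
E0  = 0.1 * np.eye(n)
lam = 2.0
G   = -np.eye(n)          # terminal weight, G <= 0
T, N = 1.0, 10_000
h = T / N

# Explicit Euler from t = T down to t = 0 for
#   dP/dt = -(A^T P + P A + C^T P C + lam * E0^T P E0),  P(T) = G.
P = G.copy()
for _ in range(N):
    dPdt = -(A.T @ P + P @ A + C.T @ P @ C + lam * E0.T @ P @ E0)
    P -= h * dPdt         # step backward: P(t - h) ~ P(t) - h * dP/dt

print(P)                  # approximates P(0); symmetric negative semidefinite
```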

Remark 15. From the assumption that $G \leq 0$ and $R(t) \leq 0$, it follows that $P(t) \leq 0$ and, consequently, the function $\mathcal{H}\left(t, \hat{X}^{x_0}(t), \cdot\right)$ is concave with respect to $u$.

As a consequence of Theorem 4.2, we have the following result.

Theorem 5.2. Given an admissible control $\hat{u}(\cdot) \in S^8_{\mathcal{F}}\left(t,T;\mathbb{R}^m\right)$, let $(p(\cdot), q(\cdot), r(\cdot,\cdot))$ be the unique solution to the BSDE (44). Then $\hat{u}(\cdot)$ is an equilibrium control for the LQ problem (41)-(42) if and only if the following condition holds:
\[
B^\top p(t) + D^\top q(t) + \int_Z F(z)^\top r(t,z)\, \theta(dz) + \frac{\nu(t,t)}{\nu(t,T)} R(t)\, \hat{u}(t) = 0, \quad \mathbb{P}\text{-a.s., a.e. } t \in [0,T]. \tag{46}
\]

Proof. By Theorem 4.2 and by virtue of Remark 7, $\hat{u}(\cdot)$ is an equilibrium control for the LQ problem if and only if the function $\mathcal{H}\left(t, \hat{X}^{x_0}(t), \cdot\right)$ attains its maximum at $\hat{u}(t)$, $\mathbb{P}$-a.s., a.e. $t \in [0,T]$. It is easy to verify that (46) is nothing but the first-order necessary and sufficient optimality condition for the maximum point $\hat{u}(t)$ of the quadratic function $\mathcal{H}\left(t, \hat{X}^{x_0}(t), \cdot\right)$.
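Whenever $R(t)$ is invertible (for instance negative definite, as guaranteed under the condition of Proposition 2 below), condition (46) can be solved pointwise in $t$:
\[
\hat{u}(t) = -\left( \frac{\nu(t,t)}{\nu(t,T)} R(t) \right)^{-1} \left( B^\top p(t) + D^\top q(t) + \int_Z F(z)^\top r(t,z)\, \theta(dz) \right).
\]
A minimal sketch of this pointwise inversion follows; the numerical inputs standing in for the adjoint processes are hypothetical.

```python
import numpy as np

def u_hat_from_foc(B, D, p, q, Fr_int, R, nu_tt, nu_tT):
    """Solve the first-order condition (46) for u_hat(t) at a fixed t,
    assuming R(t) is invertible. Fr_int stands in for the integral of
    F(z)^T r(t, z) theta(dz), supplied by the caller."""
    rhs = B.T @ p + D.T @ q + Fr_int
    return -np.linalg.solve((nu_tt / nu_tT) * R, rhs)

# One-dimensional illustration with made-up adjoint values.
B = np.array([[1.0]]); D = np.array([[0.5]])
p = np.array([0.2]);   q = np.array([-0.1])
Fr_int = np.array([0.05])
R = np.array([[-1.0]])
print(u_hat_from_foc(B, D, p, q, Fr_int, R, nu_tt=1.0, nu_tT=0.8))  # -> 0.16
```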

Now, let us introduce the following generalized Riccati differential equation
\[
\begin{cases}
0 = \dfrac{dM}{dt}(t) + M(t) A + A^\top M(t) + C^\top M(t) C + \displaystyle\int_Z E(z)^\top M(t) E(z)\, \theta(dz) \\
\qquad - \left( M(t) B + C^\top M(t) D + \displaystyle\int_Z E(z)^\top M(t) F(z)\, \theta(dz) \right) \Psi(t), \\
\Psi(t) = \left( \dfrac{\nu(t,t)}{\nu(t,T)} R(t) + D^\top M(t) D + \displaystyle\int_Z F(z)^\top M(t) F(z)\, \theta(dz) \right)^{-1} \\
\qquad\quad \times \left( B^\top M(t) + D^\top M(t) C + \displaystyle\int_Z F(z)^\top M(t) E(z)\, \theta(dz) \right), \quad t \in [0,T], \\
M(T) = G.
\end{cases} \tag{47}
\]
The following theorem presents an existence condition for a linear feedback equilibrium control.

Theorem 5.3. Suppose that Equation (47) admits a solution $M(\cdot) \in C\left([0,T]; \mathbb{R}^{n\times n}\right)$. Then the time-inconsistent LQ problem (41)-(42) has an equilibrium control that can be represented in the state feedback form:
\[
\hat{u}(t) = -\Psi(t) \hat{X}^{x_0}(t), \quad \text{for } t \in [0,T]. \tag{48}
\]

Proof. Let $M(\cdot)$ be the solution of (47). Then the following linear SDE
\[
\begin{cases}
d\hat{X}^{x_0}(s) = \left( A - B\Psi(s) \right) \hat{X}^{x_0}(s)\, ds + \left( C - D\Psi(s) \right) \hat{X}^{x_0}(s)\, dW(s) \\
\qquad\qquad\;\, + \displaystyle\int_Z \left( E(z) - F(z)\Psi(s) \right) \hat{X}^{x_0}(s-)\, \tilde{N}(ds,dz), \quad s \in [0,T], \\
\hat{X}^{x_0}(0) = x_0,
\end{cases}
\]

has a unique solution $\hat{X}^{x_0}(\cdot) \in S^q_{\mathcal{F}}\left(0,T;\mathbb{R}^n\right)$, for $q \geq 2$. Hence the control $\hat{u}(\cdot)$ defined by (48) is admissible. Moreover, we claim that $p(\cdot)$, $q(\cdot)$ and $r(\cdot,\cdot)$ are, respectively, given by

\[
p(t) = M(t) \hat{X}^{x_0}(t), \tag{49}
\]
\[
q(t) = M(t) \left( C - D\Psi(t) \right) \hat{X}^{x_0}(t), \tag{50}
\]
\[
r(t,z) = M(t) \left( E(z) - F(z)\Psi(t) \right) \hat{X}^{x_0}(t), \quad \mathbb{P}\text{-a.s., a.e. } (t,z) \in [0,T] \times Z. \tag{51}
\]
Indeed, define
\[
\tilde{p}(t) = M(t) \hat{X}^{x_0}(t), \quad \tilde{q}(t) = M(t) \left( C - D\Psi(t) \right) \hat{X}^{x_0}(t), \quad \tilde{r}(t,z) = M(t) \left( E(z) - F(z)\Psi(t) \right) \hat{X}^{x_0}(t),
\]
$\mathbb{P}$-a.s., a.e. $(t,z) \in [0,T] \times Z$. By applying Itô's formula and using Equation (47), we can show that $(\tilde{p}(\cdot), \tilde{q}(\cdot), \tilde{r}(\cdot,\cdot))$ satisfies
\[
\begin{cases}
d\tilde{p}(t) = -\left\{ A^\top \tilde{p}(t) + C^\top \tilde{q}(t) + \displaystyle\int_Z E(z)^\top \tilde{r}(t,z)\, \theta(dz) \right\} dt + \tilde{q}(t)\, dW(t) + \displaystyle\int_Z \tilde{r}(t,z)\, \tilde{N}(dt,dz), \quad t \in [0,T], \\
\tilde{p}(T) = G \hat{X}^{x_0}(T).
\end{cases}
\]
It then follows from the uniqueness of the solution to BSDE (44) that $(\tilde{p}(t), \tilde{q}(t), \tilde{r}(t,z)) \equiv (p(t), q(t), r(t,z))$, which proves our claim. Now, plugging (48), (49), (50) and (51) into the expression
\[
U(t) = B^\top p(t) + D^\top q(t) + \int_Z F(z)^\top r(t,z)\, \theta(dz) + \frac{\nu(t,t)}{\nu(t,T)} R(t)\, \hat{u}(t),
\]
we get
\[
\begin{aligned}
U(t) = {} & B^\top M(t) \hat{X}^{x_0}(t) + D^\top M(t) \left( C - D\Psi(t) \right) \hat{X}^{x_0}(t) \\
& + \int_Z F(z)^\top M(t) \left( E(z) - F(z)\Psi(t) \right) \hat{X}^{x_0}(t)\, \theta(dz) - \frac{\nu(t,t)}{\nu(t,T)} R(t) \Psi(t) \hat{X}^{x_0}(t) \\
= {} & \left( B^\top M(t) + D^\top M(t) C + \int_Z F(z)^\top M(t) E(z)\, \theta(dz) \right) \hat{X}^{x_0}(t) \\
& - \left( D^\top M(t) D + \int_Z F(z)^\top M(t) F(z)\, \theta(dz) + \frac{\nu(t,t)}{\nu(t,T)} R(t) \right) \Psi(t) \hat{X}^{x_0}(t) \\
= {} & 0.
\end{aligned}
\]

Thus, it follows from Theorem 5.2 that $\hat{u}(\cdot)$ is an equilibrium control for the LQ problem (41)-(42).

Remark 16. It is not difficult to verify that Assumptions (H1*)-(H4) are satisfied in the framework of the time-inconsistent LQ problem (41)-(42). Therefore, it follows from Theorem 4.3 that the equilibrium control given by (48) is optimal for the equivalent time-consistent LQ problem (43).

We now turn our attention to the Riccati equation (47).
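Although (47) is nonstandard because of the discount ratio $\nu(t,t)/\nu(t,T)$ appearing inside $\Psi$, it is a deterministic terminal-value problem and can be integrated backward numerically. The sketch below does this by explicit Euler in the scalar case $n = m = 1$, with a single jump atom of intensity $\lambda$ and a hyperbolic-type discount function; every numerical value, including the form of $\nu$, is a hypothetical choice made for illustration only.

```python
import numpy as np

# Hypothetical scalar data (n = m = 1); integrals over Z collapse to lam * (...).
A, B, C, D, E0, F0, lam = 0.1, 1.0, 0.2, 0.5, 0.3, 0.1, 2.0
G, T, N = -1.0, 1.0, 20_000
h = T / N
nu = lambda t, s: 1.0 / (1.0 + 0.5 * (s - t))   # assumed non-exponential discount
R  = lambda t: -1.0                              # R(t) < 0 uniformly

def Psi(t, M):
    """Feedback gain Psi(t) from (47), scalar case."""
    K = (nu(t, t) / nu(t, T)) * R(t) + D * M * D + lam * F0 * M * F0
    return (B * M + D * M * C + lam * F0 * M * E0) / K

# Explicit Euler for (47), from M(T) = G backward to M(0).
M = G
for k in range(N, 0, -1):
    t = k * h
    dMdt = -(M * A + A * M + C * M * C + lam * E0 * M * E0
             - (M * B + C * M * D + lam * E0 * M * F0) * Psi(t, M))
    M -= h * dMdt

print(M, Psi(0.0, M))   # M(0) and the time-0 feedback gain in (48)
```

With $M(\cdot)$ in hand, the equilibrium control follows from (48) as $\hat{u}(t) = -\Psi(t)\hat{X}^{x_0}(t)$.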

Proposition 2. Suppose that the functions $\nu(\cdot,\cdot)$ and $R(\cdot)$ satisfy
\[
\frac{\nu(t,t)}{\nu(t,T)} R(t) \leq -\epsilon I_m, \quad \text{a.e. } t \in [0,T],
\]
for some $\epsilon > 0$. Then the Riccati equation (47) admits a unique solution $M(\cdot) \in C\left([0,T]; S^n_-\right)$.

Proof. The proof is a straightforward extension of the proof of Theorem 7.2 in Yong and Zhou [38] (pp. 319-323); we omit it.

Concluding remarks. In this paper we have investigated open-loop equilibrium controls for a general discounting time-inconsistent stochastic control problem. We have shown that our time-inconsistent problem is equivalent to a standard optimal control problem, in the sense that the equilibrium controls for the standard problem coincide with the equilibrium controls for the time-inconsistent problem. This link allowed us to characterize the equilibrium controls using standard techniques from classical optimal control theory. The concrete examples we discussed confirm the applicability of the proposed approach. We believe it would be very interesting to extend this approach to a fairly general class of time-inconsistent stochastic control problems. Research on this topic is in progress and will appear in a forthcoming paper.

Acknowledgments. The author would like to thank the two anonymous referees for their careful reading and helpful suggestions, which led to this improved version of the paper.

REFERENCES

[1] G. Ainslie, Specious reward: A behavioral theory of impulsiveness and impulse control, Psychological Bulletin, 82 (1975), 463–496.
[2] I. Alia, F. Chighoub, N. Khelfallah and J. Vives, Time-consistent investment and consumption strategies under a general discount function, preprint, arXiv:1705.10602.
[3] N. Azevedo, D. Pinheiro and G. W. Weber, Dynamic programming for a Markov-switching jump-diffusion, Journal of Computational and Applied Mathematics, 267 (2014), 1–19.
[4] R. J. Barro, Ramsey meets Laibson in the neoclassical growth model, Quarterly Journal of Economics, 114 (1999), 1125–1152.
[5] S. Basak and G. Chabakauri, Dynamic mean-variance asset allocation, Review of Financial Studies, 23 (2010), 2970–3016.
[6] T. Björk and A. Murgoci, A general theory of Markovian time-inconsistent stochastic control problems, 2010. Available from: https://ssrn.com/abstract=1694759.
[7] T. Björk, A. Murgoci and X. Y. Zhou, Mean-variance portfolio optimization with state-dependent risk aversion, Mathematical Finance, 24 (2014), 1–24.
[8] T. Björk and A. Murgoci, A theory of Markovian time-inconsistent stochastic control in discrete time, Finance and Stochastics, 18 (2014), 545–592.
[9] T. Björk, M. Khapko and A. Murgoci, On time-inconsistent stochastic control in continuous time, Finance and Stochastics, 21 (2017), 331–360.
[10] C. Czichowsky, Time-consistent mean-variance portfolio selection in discrete and continuous time, Finance and Stochastics, 17 (2013), 227–271.
[11] B. Djehiche and M. Huang, A characterization of sub-game perfect Nash equilibria for SDEs of mean field type, Dynamic Games and Applications, 6 (2016), 55–81.
[12] Y. Dong and R. Sircar, Time-inconsistent portfolio investment problems, in Stochastic Analysis and Applications, Springer, 100 (2014), 239–281.
[13] I. Ekeland and A. Lazrak, Equilibrium policies when preferences are time-inconsistent, preprint, arXiv:0808.3790v1.
[14] I. Ekeland and T. A. Pirvu, Investment and consumption without commitment, Mathematics and Financial Economics, 2 (2008), 57–86.

[15] I. Ekeland, O. Mbodji and T. A. Pirvu, Time-consistent portfolio management, SIAM Journal on Financial Mathematics, 3 (2012), 1–32.
[16] S. M. Goldman, Consistent plans, Review of Economic Studies, 47 (1980), 533–537.
[17] Y. Hu, H. Jin and X. Y. Zhou, Time-inconsistent stochastic linear quadratic control, SIAM Journal on Control and Optimization, 50 (2012), 1548–1572.
[18] Y. Hu, H. Jin and X. Y. Zhou, Time-inconsistent stochastic linear quadratic control: Characterization and uniqueness of equilibrium, SIAM Journal on Control and Optimization, 55 (2017), 1261–1279, arXiv:1504.01152.
[19] Y. Hu, J. Huang and X. Li, Equilibrium for time-inconsistent stochastic linear-quadratic control under constraint, preprint, arXiv:1703.09415v1.
[20] P. Krusell and A. Smith, Consumption and savings decisions with quasi-geometric discounting, Econometrica, 71 (2003), 365–375.
[21] F. E. Kydland and E. Prescott, Rules rather than discretion: The inconsistency of optimal plans, Journal of Political Economy, 85 (1977), 473–492.
[22] G. Loewenstein and D. Prelec, Anomalies in intertemporal choice: Evidence and an interpretation, Choices, Values, and Frames, (2019), 578–596.
[23] J. Marin-Solano and J. Navas, Consumption and portfolio rules for time-inconsistent investors, European Journal of Operational Research, 201 (2010), 860–872.
[24] J. Marin-Solano and E. V. Shevkoplyas, Non-constant discounting and differential games with random time horizon, Automatica, 47 (2011), 2626–2638.
[25] Q. Meng, General linear quadratic optimal stochastic control problem driven by a Brownian motion and a Poisson random martingale measure with random coefficients, Stochastic Analysis and Applications, 32 (2014), 88–109.
[26] R. C. Merton, Optimum consumption and portfolio rules in a continuous-time model, Journal of Economic Theory, 3 (1971), 373–413.
[27] B. Øksendal and A. Sulem, Applied Stochastic Control of Jump Diffusions, 2nd edition, Springer, 2007.
[28] E. S. Phelps and R. A. Pollak, On second-best national saving and game-equilibrium growth, Studies in Macroeconomic Theory, (1980), 201–215.
[29] R. A. Pollak, Consistent planning, Review of Economic Studies, 35 (1968), 201–208.
[30] Y. Shen and T. K. Siu, The maximum principle for a jump-diffusion mean-field model and its application to the mean-variance problem, Nonlinear Analysis, 86 (2013), 58–73.
[31] R. Strotz, Myopia and inconsistency in dynamic utility maximization, Readings in Welfare Economics, (1973), 128–143.
[32] S. Tang and X. Li, Necessary conditions for optimal control of stochastic systems with random jumps, SIAM Journal on Control and Optimization, 32 (1994), 1447–1475.
[33] Q. Wei, J. Yong and Z. Yu, Time-inconsistent recursive stochastic optimal control problems, SIAM Journal on Control and Optimization, 55 (2017), 4156–4201, arXiv:1606.03330v1.
[34] J. Yong, A deterministic linear quadratic time-inconsistent optimal control problem, Mathematical Control and Related Fields, 1 (2011), 83–118.
[35] J. Yong, Deterministic time-inconsistent optimal control problems - an essentially cooperative approach, Acta Mathematicae Applicatae Sinica (English Series), 28 (2012), 1–30.
[36] J. Yong, Time-inconsistent optimal control problems and the equilibrium HJB equation, Mathematical Control and Related Fields, 2 (2012), 271–329.
[37] J. Yong, Linear-quadratic optimal control problems for mean-field stochastic differential equations - time-consistent solutions, Transactions of the American Mathematical Society, 369 (2017), 5467–5523.
[38] J. Yong and X. Y. Zhou, Stochastic Controls: Hamiltonian Systems and HJB Equations, Springer-Verlag, New York, 1999.
[39] Q. Zhao, On Time-Inconsistent Investment and Dividend Problems, Ph.D thesis, Macquarie University, Australia, 2015.
[40] Q. Zhao, Y. Shen and J. Wei, Consumption-investment strategies with non-exponential discounting and logarithmic utility, European Journal of Operational Research, 238 (2014), 824–835.

Received August 2017; revised November 2018.

E-mail address: [email protected]