Recommended Citation: Aboagye, Prince Osei, "On Numerical Stochastic Optimal Control Via Bellman's Dynamic Programming Principle" (2018). Open Access Theses & Dissertations. 1386. https://digitalcommons.utep.edu/open_etd/1386
ON NUMERICAL STOCHASTIC OPTIMAL CONTROL VIA
BELLMAN’S DYNAMIC PROGRAMMING PRINCIPLE
PRINCE OSEI ABOAGYE
Master’s Program in Mathematical Sciences
APPROVED:
Michael Pokojovy, Ph.D., Chair
Sangjin Kim, Ph.D.
Thompson Sarkodie-Gyan, Ph.D.
Charles Ambler, Ph.D.
Dean of the Graduate School

Copyright ©
by
Prince Osei Aboagye
2018

ON NUMERICAL STOCHASTIC OPTIMAL CONTROL VIA
BELLMAN’S DYNAMIC PROGRAMMING PRINCIPLE
by
PRINCE OSEI ABOAGYE, B.A.
THESIS
Presented to the Faculty of the Graduate School of
The University of Texas at El Paso
in Partial Fulfillment
of the Requirements
for the Degree of
MASTER OF SCIENCE
Department of Mathematical Sciences
THE UNIVERSITY OF TEXAS AT EL PASO
August 2018

Acknowledgements
I give all the glory and honor to God for successfully taking me through my Master's degree. I am grateful to the Department of Mathematics at UTEP for this educational opportunity and to the professors who guided me and kept me on track. I also thank my committee for taking the time to attend my presentation and for their feedback; in particular, special thanks to my mentor Dr. Michael Pokojovy for guiding me into an amazing career path in mathematics and statistics and for his lasting patience and understanding. My warm appreciation goes to my parents and siblings for their moral and emotional support. Lastly, I would like to express my heartfelt gratitude to my grandfather, Mr. George Ansanyi, who single-handedly funded my education from senior high school through the completion of my undergraduate degree. Grandpa, God richly bless you. Without all of them this would not have happened; they make me feel blessed.
Abstract
In this work, we present an application of Stochastic Control Theory to Merton's portfolio optimization problem. The dynamic programming methodology is applied to reduce the whole problem to solving the well-known HJB (Hamilton-Jacobi-Bellman) equation that arises from Merton's portfolio optimization problem subject to a power utility function. Finally, a numerical method is proposed to solve the HJB equation and compute the optimal strategy. The numerical solutions are compared with the explicit solutions for the optimal consumption and investment control policies.
Table of Contents

Acknowledgements
Abstract
Table of Contents
List of Figures
1 Introduction
2 Stochastic Calculus
2.1 Preliminaries
2.1.1 Probability Space
2.1.2 Random Variable
2.1.3 Brief Sketch of Lebesgue's Integral
2.1.4 Convergence Concepts for Random Variables
2.1.5 The Lebesgue-Stieltjes Integral
2.2 Stochastic Processes and Brownian Motion
2.2.1 Discrete Stochastic Processes
2.2.2 Continuous Stochastic Processes
2.2.3 Martingales
2.2.4 Stopping Times and Optional Stopping
2.2.5 The Wiener Process and White Noise
2.2.6 Existence: A Multiscale Construction
2.2.7 White Noise
2.3 The Stochastic Integral
2.3.1 Some Elementary Properties
2.3.2 The Itô Calculus
2.3.3 Girsanov's Theorem
2.3.4 The Martingale Representation Theorem
3 Stochastic Control and Dynamic Programming
3.1 Stochastic Control Problems in Standard Form
3.2 The Dynamic Programming Principle
3.2.1 A Weak Dynamic Programming Principle
3.3 The Dynamic Programming Equation
3.3.1 Continuity of the Value Function for Bounded Controls
4 Optimal Stopping and Dynamic Programming
4.1 Optimal Stopping Problems
4.2 The Dynamic Programming Principle
4.3 The Dynamic Programming Equation
4.4 Regularity of the Value Function
4.4.1 Finite Horizon Optimal Stopping
5 Solving Control Problems by Verification
5.1 The Verification Argument for Stochastic Control Problems
5.2 The Verification Argument for Optimal Stopping Problems
6 Introduction to Viscosity Solutions
6.1 Intuition Behind Viscosity Solutions
6.2 Definition of Viscosity Solutions
6.3 First Properties
6.4 Comparison Result and Uniqueness
6.4.1 Comparison of Classical Solutions in a Bounded Domain
6.4.2 Semijets Definition of Viscosity Solutions
6.4.3 The Crandall-Ishii Lemma
6.4.4 Comparison of Viscosity Solutions in a Bounded Domain
6.5 Comparison in Unbounded Domains
7 Finite Difference Numerical Approximations
7.1 Introduction
7.2 Controlled Discrete Time Markov Chains
7.3 Finite Difference Approximations to HJB Equations
7.4 Convergence of Finite Difference Approximations
8 Application: Merton's Portfolio Optimization Problem
8.1 Merton's Portfolio Optimization Problem and the HJB Equation
8.2 The HJB Equation
8.2.1 Utility Function
9 Numerical Results
9.1 Brief Summary of the Steps Involved in Obtaining the HJB Equations Associated with Merton's Portfolio Problem
10 Conclusions and Further Directions
Bibliography
Appendix
Curriculum Vitae
List of Figures

9.1 Comparing theoretical and numerical HJB.
9.2 Numerical approximation of the optimal investment and consumption strategies.
9.3 Wealth process.
Chapter 1: Introduction
The main thrust of Stochastic Control Theory is to determine an optimal control strategy for a controlled Markovian diffusion with the goal of optimizing a certain criterion of interest: maximizing the expected discounted utility of an investment strategy or the expected policy return of a learning strategy, minimizing the average costs or the mean/median risk of a business operation or the discrepancy from a desired trajectory of a self-driving vehicle, etc.
High-frequency data commonly occur in various areas of science and engineering, ranging from mathematical finance (high-frequency trading), data science (time series analysis) and computer science (signal processing) to quality engineering (control charts) and geological sciences (analysis of seismic data). In many cases, the underlying data generating process can adequately be modeled as a Markovian diffusion, i.e., a solution to a stochastic differential equation from a wide class.
The canonical application area of Markovian diffusions is mathematical finance. Indeed, Markovian diffusions or stochastic differential equations are a hypernym for a whole class of stochastic models such as the standard geometric Brownian motion, the Heston model, the CEV model, the SABR volatility model, the (continuous) GARCH model, the Chen model, the Vasiček model, the Hull-White model, etc. (to name only a few). Due to their generality, Markovian diffusions are further applied to
1. approximate computation of the ARL (average run length) and control limits for CUSUM- and EWMA-type control charts,
2. designing and assessing sequential tests in bio-statistical applications,
3. describing the limiting behavior of HMM (Hidden Markov Models),
4. approximating the limiting behavior of time-discrete econometric models (GARCH, ARMA, etc.),
5. design of experiments for temporal data,
6. longitudinal and functional data analysis (FDA),
7. machine learning (reinforcement learning, temporal difference learning, etc.),
8. pattern recognition and discriminant analysis for geological data, etc.
Once the data are analyzed and the parameters of the underlying data generating process are estimated, further inference procedures such as predictive analysis, statistical tests, confidence interval or region construction, etc., can be performed by the analyst to facilitate the decision making procedure. In many cases, the temporal evolution of the process can be controlled through an external input. For example, consider a portfolio managed by a financial institution. The portfolio value is directly affected by the portfolio structure and the price of the assets in the portfolio. Whereas the former can be selected according to an investment strategy, the latter (to a large extent) is determined by the market and, thus, falls beyond one’s control.
Research in this field was pioneered by Bellman and Pontryagin. A large body of research on control theory has developed in recent years, inspired in particular by problems from mathematical finance. Applying the dynamic programming principle (DPP) to a stochastic control problem for Markov processes leads to a nonlinear partial differential equation (PDE) called the Hamilton-Jacobi-Bellman (HJB) equation, an approach pioneered by Bellman. These PDEs are named after Sir William Rowan Hamilton, Carl Gustav Jacobi and Richard Bellman [1].
Even parameter estimation for Markovian diffusions can often be reduced to optimal control problems [2]. Due to their complexity, only very few optimal control problems for stochastic differential equations can be solved explicitly. Thus, there is an increasing need for fast, efficient and reliable numerical techniques for solving this kind of problem. The goal of this thesis is to present a numerical technique for solving the HJB equation and computing the optimal strategy arising from Merton's portfolio optimization problem.
In the classical Merton portfolio optimization problem, an investor endowed with an initial capital consumes a certain amount of the capital and invests the remaining wealth into the financial market [3]. The investor has two investment options: a riskless asset with a constant interest rate and a risky asset whose price is assumed to follow a geometric Brownian motion. Given a fixed investment time period, the investor's objective is to find an optimal consumption-investment strategy maximizing the expected utility of wealth at the terminal trading time and of intermediate consumption [3].
In his landmark paper (Merton 1969), Merton formulated the optimal investment and consumption problem as a stochastic optimal control problem which could be solved with the dynamic programming principle. This leads to the well-known Hamilton-Jacobi-Bellman (HJB) equation, potentially a fully nonlinear partial differential equation (PDE). The main difficulty of the Merton problem is the nonlinearity of the HJB equation, which makes it difficult to solve analytically [4].
Our approach is to derive the dynamic programming equations for the resulting Merton portfolio optimization problem and solve them numerically using a finite difference scheme.
This thesis is organized as follows. Chapter 2 gives a brief introduction to Stochastic Calculus, which serves as a fundamental tool throughout this thesis. Chapter 3 outlines the basic structure of a stochastic optimization problem and shows how the Dynamic Programming Principle can be used to derive the HJB equation. Chapter 4 derives results analogous to those of the preceding chapter for standard stochastic control problems, now in the context of optimal stopping problems. Chapter 5 looks at how to verify that a "guess" of the value function is indeed equal to the unknown value function. Chapter 6 introduces the concept of viscosity solutions of the HJB equation: the derivation of the HJB equation assumes that the solution is sufficiently smooth, which is not always the case, and viscosity solutions are proposed to circumvent this difficulty. Chapter 7 gives a brief introduction to the numerical solution of HJB partial differential equations using finite difference schemes. Chapter 8 presents an application in the form of an asset allocation problem, namely Merton's portfolio optimization problem. Finally, Chapter 9 uses the finite difference numerical scheme to solve the HJB equation derived from Merton's portfolio optimization problem and compares the results with the explicit solutions for the optimal consumption and investment control policies. Our main contribution in this thesis is the numerical solution of Merton's portfolio optimization problem via Bellman's Dynamic Programming Principle.
Chapter 2: Stochastic Calculus
In this Chapter we closely follow [5], [6], [7], [8] without claiming direct authorship; our aim is only to summarize.
2.1 Preliminaries
2.1.1 Probability Space
Definition 2.1. A triple (Ω, F,P ) is called a probability space provided Ω is a nonempty set, F is a σ-algebra of subsets of Ω, and P is a σ-additive measure on (Ω, F) with P [Ω] = 1 [5].
2.1.2 Random Variable
Definition 2.2. The Borel subsets of Rn, denoted B, comprise the smallest σ-algebra of subsets of Rn containing all open sets [5].
Definition 2.3. Let (Ω, F,P ) be a probability space. A mapping X :Ω → Rn is called an n-dimensional random variable if for each B ∈ B, we have X−1(B) ∈ F. We equivalently say that X is F-measurable [5].
2.1.3 Brief Sketch of Lebesgue's Integral
The Lebesgue integral of a random variable X can be defined in three steps [6].
1. For a discrete random variable of the form
$$X = \sum_{i=1}^{n} \alpha_i 1_{A_i}, \qquad \alpha_i \in \mathbb{R},\ A_i \in \mathcal{F},$$
the integral of X is defined as
$$\mathbb{E}[X] := \int_\Omega X(\omega)\, dP(\omega) := \sum_i \alpha_i P[A_i].$$

2. Let ξ denote the set of all discrete random variables. Consider the set of all random variables which are monotone limits of discrete random variables, i.e., define
$$\xi^* := \{X : \exists\ u_1 \le u_2 \le \dots,\ u_n \in \xi,\ u_n \uparrow X\}.$$
Remark: if X is a random variable with X ≥ 0, then X ∈ ξ*. For X ∈ ξ* define
$$\int_\Omega X\, dP := \lim_{n\to\infty} \int_\Omega u_n\, dP.$$

3. For an arbitrary random variable X consider the decomposition X = X+ − X− with
$$X^+ := \sup(X, 0), \qquad X^- := \sup(-X, 0).$$
According to step 2, X+, X− ∈ ξ*. If either E[X−] < ∞ or E[X+] < ∞, define
$$\int_\Omega X\, dP := \int_\Omega X^+\, dP - \int_\Omega X^-\, dP.$$
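As an illustrative sketch (not part of the thesis), the three-step construction can be mirrored numerically: step 1 computes the expectation of a simple random variable as a weighted sum over its values, and step 2 approximates E[X] for a nonnegative X through the monotone dyadic simple approximations u_n = min(⌊2^n X⌋/2^n, n). The choice of test distribution below is ours.

```python
import numpy as np

rng = np.random.default_rng(0)

# Step 1: for a simple random variable X = sum_i alpha_i * 1_{A_i},
# E[X] is just the weighted sum of the values.
alphas = np.array([1.0, 2.0, 5.0])
probs = np.array([0.5, 0.3, 0.2])   # P[A_1], P[A_2], P[A_3]
e_simple = float(np.sum(alphas * probs))
print(e_simple)  # 2.1

# Step 2: a nonnegative X is the monotone limit of the simple variables
# u_n = min(floor(2^n X) / 2^n, n), so E[X] = lim_n E[u_n].
# Here X = |Z| with Z standard normal, so E[X] = sqrt(2/pi) ~ 0.7979.
x = np.abs(rng.standard_normal(1_000_000))
approx = []
for n in [1, 2, 4, 8]:
    u_n = np.minimum(np.floor(2.0**n * x) / 2.0**n, n)
    approx.append(u_n.mean())
print(approx)  # monotonically increases toward sqrt(2/pi)
```

Monotonicity of the sample means reflects the pointwise monotonicity u_n ↑ X of the dyadic approximations.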
Properties of the Lebesgue Integral:
• Linearity: $\int_\Omega (\alpha X + \beta Y)\, dP = \alpha \int_\Omega X\, dP + \beta \int_\Omega Y\, dP$.
• Positivity: X ≥ 0 implies $\int_\Omega X\, dP \ge 0$, and
$$\int_\Omega X\, dP > 0 \iff P[X > 0] > 0.$$
• Monotone Convergence (Beppo Levi). Let (Xn) be a monotone sequence of random variables (i.e., Xn ≤ Xn+1) with X1 ≥ C. Then
$$X := \lim_n X_n \in \xi^*$$
and
$$\lim_{n\to\infty} \int_\Omega X_n\, dP = \int_\Omega \lim_{n\to\infty} X_n\, dP = \int_\Omega X\, dP.$$
• Fatou’s Lemma
(i) For any sequence (Xn) of random variables which are bounded from below, one has
lim inf Xn dP ≤ lim inf Xn dP. ˆ n→∞ n→∞ ˆ Ω Ω
• For any sequence (Xn) of random variables bounded from above, one has
lim sup Xn dP ≥ lim sup Xn dP. ˆ n→∞ n→∞ ˆ Ω Ω
• Jensen’s Inequality Let X be an integrable random variable with values in R and u : R → R a convex function. Then one has
u (E [X]) ≤ E [u (X)] .
Lp-Spaces (1 ≤ p < ∞)

Lp(Ω) denotes the set of all real-valued random variables X on (Ω, F, P) with E[|X|^p] < ∞ for some 1 ≤ p < ∞. For X ∈ Lp(Ω), the Lp-norm is defined as
$$\|X\|_p := \left(\mathbb{E}[|X|^p]\right)^{1/p}.$$
The Lp-norm has the following properties:

1. Hölder's Inequality. Given X ∈ Lp(Ω) and Y ∈ Lq(Ω) with 1/p + 1/q = 1, one has
$$\int_\Omega |X| \cdot |Y|\, dP \le \left(\int_\Omega |X|^p\, dP\right)^{1/p} \cdot \left(\int_\Omega |Y|^q\, dP\right)^{1/q} < \infty.$$
In particular, since |X · Y| ≤ |X| · |Y|, this implies X · Y ∈ L1(Ω).

2. Lp(Ω) is a normed vector space. In particular, X, Y ∈ Lp(Ω) implies X + Y ∈ Lp(Ω), and one has the triangle inequality
$$\|X + Y\|_p \le \|X\|_p + \|Y\|_p.$$

3. Lq(Ω) ⊂ Lp(Ω) for p < q.
2.1.4 Convergence Concepts for Random Variables
Definition 2.4. Let (Xn)n∈ℕ, X be random variables on (Ω, F, P).

1. The sequence (Xn) converges to X P-almost surely if
$$P[\{\omega : X_n(\omega) \to X(\omega)\}] = 1.$$
We will then write Xn → X P-a.s.

2. The sequence (Xn) converges to X in probability if, for every ε > 0,
$$\lim_{n\to\infty} P[|X_n - X| > \varepsilon] = 0.$$
We will then write P-lim Xn = X.

3. Let (Xn) be in Lp(Ω) for some p ∈ [1, ∞). The sequence (Xn) converges to X in Lp if
$$\lim_{n\to\infty} \|X_n - X\|_p = \lim_{n\to\infty} \left(\mathbb{E}[|X_n - X|^p]\right)^{1/p} = 0.$$

Definition 2.5. The sequence (Xn) converges to X weakly (in distribution) if, for every continuous bounded function f : E → ℝ,
$$\lim_{n\to\infty} \int_{\Omega_n} f(X_n)\, dP_n = \int_\Omega f(X)\, dP.$$
We will then write $X_n \xrightarrow{D} X$.
Proposition 2.6. Let X be a random variable and (Xn) be a sequence of random variables on the probability space (Ω, F, P). The following implications hold [7]:
1. Xn → X a.s. ⟹ Xn → X in probability.
2. Xn → X in Lp ⟹ Xn → X in probability.
3. Xn → X in Lp ⟹ Xn → X in Lq (q ≤ p).
4. Xn → X in probability ⟹ Xn → X in law.
Definition 2.7. The sequence (Xn) is called uniformly integrable if
$$\lim_{C\to\infty} \sup_n \int_{\{|X_n| > C\}} |X_n|\, dP = 0.$$
Sufficient conditions for uniform integrability are the following:
1. sup_n E[|Xn|^p] < ∞ for some p > 1;
2. there exists a random variable Y ∈ L1 such that |Xn| ≤ Y P-a.s. for all n.
Condition 2 is Lebesgue's "dominated convergence" condition.
2.1.5 The Lebesgue-Stieltjes Integral
Consider a real-valued random variable X on (Ω, F, P) and a Borel-measurable mapping f : ℝ → ℝ, i.e., we have
$$(\Omega, \mathcal{F}, P) \xrightarrow{X} (\mathbb{R}, \mathcal{B}, P_X) \xrightarrow{f} \mathbb{R}$$
with
$$P_X[B] := P[X^{-1}(B)] \quad \text{(the distribution of } X\text{)},$$
$$F_X(x) := P_X[(-\infty, x]] = P[X \le x] \quad \text{(the distribution function of } X\text{)}.$$
Then the (Lebesgue-Stieltjes) integral $\int_{\mathbb{R}} f(x)\, dF_X(x)$ is well defined due to the following integral transformation formula:

Proposition 2.8.
$$\int_\Omega f \circ X\, dP = \int_{\mathbb{R}} f\, dP_X = \int_{\mathbb{R}} f(x)\, dF_X(x).$$
Properties of F = FX :
1. F is isotone, i.e. x ≤ y =⇒ F (x) ≤ F (y),
2. F is right continuous,
3. $\lim_{x\to-\infty} F(x) = 0$; $\lim_{x\to+\infty} F(x) = 1$.
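The integral transformation formula can be checked numerically. In the sketch below (an illustration of ours, not from the thesis), X is exponentially distributed with rate 1 and f(x) = x²: the Monte Carlo average of f(X) over the sample space agrees with the Lebesgue-Stieltjes integral ∫ f(x) dF_X(x), computed by quadrature against the density F′_X(x) = e^(−x).

```python
import numpy as np

rng = np.random.default_rng(1)

# Integral transformation formula: E[f(X)] = int f(x) dF_X(x).
# X ~ Exp(1), f(x) = x^2, so both sides equal E[X^2] = 2.
f = lambda x: x**2

# Left-hand side: Monte Carlo average of f(X) on the sample space Omega.
samples = rng.exponential(1.0, size=2_000_000)
lhs = f(samples).mean()

# Right-hand side: Lebesgue-Stieltjes integral with dF_X(x) = e^{-x} dx,
# approximated by a midpoint rule on a truncated grid.
dx = 0.001
x = np.arange(dx / 2, 40.0, dx)
rhs = np.sum(f(x) * np.exp(-x)) * dx

print(lhs, rhs)  # both close to 2
```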
2.2 Stochastic Processes and Brownian Motion
A stochastic process is a family of random variables {X(t), t ∈ τ} defined on a probability space (Ω, F,P ) and indexed by a parameter t where t varies over a set τ . If the set τ is discrete, the stochastic process is called discrete. If the set τ is continuous, the stochastic process is called continuous. The parameter t usually plays the role of time and the random variables can be discrete valued or continuous-valued at each value of t [8].
2.2.1 Discrete Stochastic Processes
Let τ = {t0, t1, t2,...} be a set of discrete times. Let each element of the sequence of random variables X(t0),X(t1),X(t2),... be defined on the sample space Ω. The sequence
{Xn} is said to be a Markov process if, given the present, the future is independent of the past. A discrete-valued Markov process is called a Markov chain.
Let P (Xn+1 = xn+1 | Xn = xn) define the one-step transition probabilities for a Markov chain. That is, P (Xn+1 = xn+1 and Xn = xn) = P (Xn+1 = xn+1 | Xn = xn)P (Xn = xn) . If the transition probabilities are independent of time tn, then the Markov chain is said to have stationary transition probabilities and the Markov chain is referred to as a homogeneous Markov chain [8].
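As an illustration of ours (not from the thesis), the following sketch simulates a homogeneous two-state Markov chain with a hypothetical transition matrix P and recovers P from the empirical one-step transition frequencies, confirming that the transition probabilities do not drift with time.

```python
import numpy as np

rng = np.random.default_rng(1)

# Homogeneous two-state Markov chain: the one-step transition matrix
# P[i, j] = P(X_{n+1} = j | X_n = i) does not depend on the time index n.
P = np.array([[0.9, 0.1],
              [0.4, 0.6]])

n_steps = 100_000
x = np.zeros(n_steps, dtype=int)
for n in range(1, n_steps):
    # move to state 1 with probability P[current state, 1]
    x[n] = int(rng.random() < P[x[n - 1], 1])

# Empirical one-step transition frequencies recover P.
counts = np.zeros((2, 2))
for i, j in zip(x[:-1], x[1:]):
    counts[i, j] += 1.0
P_hat = counts / counts.sum(axis=1, keepdims=True)
print(np.round(P_hat, 3))  # close to P
```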
Definition 2.9. Let (Ω, F,P ) be a probability space. A (discrete time) filtration is an increasing sequence {Fn} of σ − algebras F0 ⊂ F1 ⊂ · · · ⊂ F. The quadruple
(Ω, F, {Fn} ,P ) is called a filtered probability space [7].
Definition 2.10. Let (Ω, F, {Fn}, P) be a filtered probability space. A stochastic process {Xn} is called Fn−adapted if Xn is Fn−measurable for every n, and is called Fn−predictable if Xn is Fn−1−measurable for every n [7].
Hence, if {Xn} is adapted, then it means that Xn is a measurement of something in the past or present (up to and including time n), while in the predictable case Xn represents a measurement of something in the past (before time n) [7].
Definition 2.11. Let (Ω, F, P) be a probability space and {Xn} be a stochastic process. The filtration generated by {Xn} is defined as Fn^X = σ{X0, . . . , Xn}, and the process {Xn} is Fn^X−adapted by construction [7].
2.2.2 Continuous Stochastic Processes
Let {X(t), t ∈ τ} be a continuous stochastic process defined on the probability space (Ω, F, P), where τ = [0, T] is an interval in time and the process is defined at all instants of time in the interval. A continuous-time stochastic process is a function X : τ × Ω → ℝ of two variables t and ω and may be discrete-valued or continuous-valued. In particular, X(t) = X(t, ·) is a random variable for each value of t ∈ τ, and X(·, ω) maps the interval τ into ℝ for each ω ∈ Ω and is called a sample path, a realization, or a trajectory of the stochastic process [8].
Remark 2.12. From this point, we will primarily work in continuous time t ∈ [0, ∞] .
Definition 2.13. Let Xt be a stochastic process on some filtered probability space
(Ω, F, {Ft} ,P ) and time set T ⊂ [0, ∞] . Then Xt is called adapted if Xt is Ft−measurable for all t, is measurable if the random variable X : T × Ω → R is B (T) × F−measurable, and is called progressively measurable if X : [0, t] ∩ T × Ω → R is
B ([0, t] ∩ T) × Ft−measurable for all t [7].
If Xt is Ft−adapted and measurable, then it is Ft−progressively measurable [7].
Lemma 2.14. Let the process Xt have continuous sample paths. Then the random variables $\inf_t X_t$, $\sup_t X_t$, $\liminf_t X_t$ and $\limsup_t X_t$ are measurable [7].
Definition 2.15. Let Xt be an a.s. nonnegative stochastic process with continuous sample paths. Then
$$\mathbb{E}\Big[\liminf_t X_t\Big] \le \liminf_t\, \mathbb{E}[X_t].$$
If there is a Y ∈ L1 such that Xt ≤ Y a.s. for all t, then
$$\mathbb{E}\Big[\limsup_t X_t\Big] \ge \limsup_t\, \mathbb{E}[X_t].$$
2.2.3 Martingales
A martingale is a very special type of stochastic process.
Definition 2.16. (Martingale) Let (Ω, F, {Fs}, P) be a filtered probability space, and let M(·) be an adapted integrable process with values in a separable Banach space W. Then M(·) is said to be a martingale if, for all r, s ∈ [t, T], s ≤ r,
$$\mathbb{E}[M(r) \mid \mathcal{F}_s] = M(s) \quad P\text{-a.s.}$$
If W = ℝ, we say that M(·) is a submartingale (respectively, supermartingale) if
$$\mathbb{E}[M(r) \mid \mathcal{F}_s] \ge M(s) \quad \big(\text{respectively, } \mathbb{E}[M(r) \mid \mathcal{F}_s] \le M(s)\big) \quad P\text{-a.s.}$$
Lemma 2.17. (Doob's decomposition). Let (Ω, F, P) be a probability space, let {Fn} be a filtration, and let {Xn} be Fn−adapted with Xn ∈ L1 for every n. Then Xn = X0 + An + Mn P-a.s., where {An} is Fn−predictable and {Mn} is an Fn−martingale with M0 = 0. Moreover, this decomposition is unique [7].
Definition 2.18. Let {Mn} be a martingale and {An} be a predictable process. Then the martingale transform of M by A,
$$(A \cdot M)_n = \sum_{k=1}^{n} A_k (M_k - M_{k-1}),$$
is again a martingale, provided that An and (A · M)n are in L1 for all n [7].
Lemma 2.19. (Doob's upcrossing lemma). Let {Mn} be a martingale, and denote by Un(a, b) the number of upcrossings of a ≤ b up to time n: that is, Un(a, b) is the number of times that Mk crosses from below a to above b before time n. Then we have
$$\mathbb{E}[U_n(a, b)] \le \frac{\mathbb{E}[(a - M_n)^+]}{b - a} \quad [7].$$
Theorem 2.20. (Martingale convergence). Let {Mn} be an Fn−martingale such that one of the following holds:
(a) sup_n E(|Mn|) < ∞; or (b) sup_n E((Mn)+) < ∞; or (c) sup_n E((Mn)−) < ∞.
Then there exists an F∞−measurable random variable M∞ ∈ L1, where F∞ = σ{Fn : n = 1, 2, . . .}, such that Mn → M∞ a.s.
Theorem 2.21. Let Mt be a martingale, i.e., E(Mt | Fs) = Ms a.s. for any s ≤ t, and assume that Mt has continuous sample paths. If any of the following conditions holds:
(a) sup_t E(|Mt|) < ∞; or (b) sup_t E((Mt)+) < ∞; or (c) sup_t E((Mt)−) < ∞;
then there exists an F∞−measurable random variable M∞ ∈ L1 such that Mt → M∞ a.s.
2.2.4 Stopping Times and Optional Stopping
Definition 2.22. An (Fn−) stopping time is a random time τ :Ω → {0, 1,..., ∞} such that {ω ∈ Ω: τ (ω) ≤ n} ∈ Fn for every n.
Using the notion of a stopping time, we can define stopped processes as:
Definition 2.23. Let {Xn} be a stochastic process and τ < ∞ be a stopping time. Then Xτ denotes the random variable Xτ(ω)(ω), i.e., the process Xn evaluated at the random time τ. For any stopping time τ, the stochastic process X′n(ω) = Xn∧τ(ω)(ω) is called the stopped process: i.e., X′n = Xn for n < τ, and X′n = Xτ for n ≥ τ [7].
Definition 2.24. Let (Fn) be a filtration and let τ be a stopping time. By definition,
Fτ = {A ∈ F∞ : A ∩ {τ ≤ n} ∈ Fn for all n} is the σ − algebra of events that occur before time τ (recall that F∞ = σ {Fn : n = 1, 2,...}) . If τ < ∞ a.s., then Xτ is well defined and
Fτ − measurable [7].
Let Xn be a martingale (or a super- or submartingale). By the above representation for the stopped process, it is evident that the stopped process is again a martingale (or super- or submartingale, respectively) [7].
Lemma 2.25. If Mn is a martingale (or super-, submartingale) and τ is a stopping time, then Mn∧τ is again a martingale (or supermartingale, submartingale, respectively) [7].
Theorem 2.26. (Optional stopping). Let Mn be a martingale, and let τ < ∞ be a stopping time. Then E(Mτ) = E(M0) holds under any of the following conditions:
(a) τ < K a.s. for some K ∈ ℕ;
(b) |Mn| ≤ K for some K ∈ [0, ∞) and all n;
(c) |Mn − Mn−1| ≤ K a.s. for some K ∈ [0, ∞) and all n, and E(τ) < ∞.
If Mn is a supermartingale, then under the above conditions E(Mτ) ≤ E(M0) [7].
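As an illustration of ours (not from the thesis): for a simple symmetric random walk stopped on exiting (−a, b), the increments are bounded and E(τ) < ∞, so condition (c) applies, and E(Mτ) = E(M0) = 0 pins down the classical gambler's-ruin probability P(Mτ = b) = a/(a + b). A Monte Carlo sketch:

```python
import numpy as np

rng = np.random.default_rng(2)

# Simple symmetric random walk M_n (a martingale) stopped when it first
# exits (-a, b).  Optional stopping, condition (c), gives E(M_tau) = 0,
# which forces P(M_tau = b) = a / (a + b).
a, b = 3, 7
n_paths = 10_000
hits_b = 0
for _ in range(n_paths):
    m = 0
    while -a < m < b:
        m += 1 if rng.random() < 0.5 else -1
    if m == b:
        hits_b += 1
p_hat = hits_b / n_paths
print(p_hat)  # near a/(a+b) = 0.3
```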
2.2.5 The Wiener Process and White Noise
Brownian motion is usually described as the limit of a random walk as the time step and mean square displacement per time step converge to zero. In this section we will see that this limit actually coincides with a well defined stochastic process called the Wiener process and we will study its most important properties [7].
2.2.5.1 Basic properties and Uniqueness
The Wiener process is the limit as N → ∞ of the random walk
$$x_t(N) = \sum_{n=1}^{\lfloor Nt \rfloor} \frac{\xi_n}{\sqrt{N}},$$
where ξn are i.i.d. random variables with zero mean and unit variance.
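The scaled random walk can be simulated directly. In the sketch below (ours, not from the thesis), ±1 steps stand in for the ξn; for large N the variance of x_t(N) is close to t and non-overlapping increments are nearly uncorrelated, as the finite dimensional distributions of the limit require.

```python
import numpy as np

rng = np.random.default_rng(3)

# Scaled random walk x_t(N) = sum_{n <= Nt} xi_n / sqrt(N) with i.i.d. +-1
# steps (zero mean, unit variance).  For large N its increments behave like
# Wiener increments: variance equal to the elapsed time, and non-overlapping
# increments uncorrelated.
N, n_paths = 1_000, 4_000
xi = rng.choice((-1.0, 1.0), size=(n_paths, N))
paths = np.cumsum(xi, axis=1) / np.sqrt(N)

x_half = paths[:, N // 2 - 1]        # x_{1/2}(N)
incr = paths[:, -1] - x_half         # x_1(N) - x_{1/2}(N)
print(x_half.var(), incr.var())      # both near 1/2
print(np.cov(x_half, incr)[0, 1])    # near 0
```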
Lemma 2.27. (Finite dimensional distributions). For any finite set of times t1 < t2 < ··· < tn, n < ∞ , the n−dimensional random variable (xt1 (N), . . . , xtn (N)) converges in law as N → ∞ to an n−dimensional random variable (xt1 , . . . , xtn ) such that xt1 , xt2 − xt1 , . . . , xtn − xtn−1 , are independent Gaussian random variables with zero mean and variance t1, t2 − t1, . . . , tn − tn−1 , respectively.
Definition 2.28. A stochastic process Wt is called a Wiener process if
1. the finite dimensional distributions of Wt are those of Lemma 2.27; and
2. the sample paths of Wt are continuous.
An ℝn−valued process Wt = (Wt^1, . . . , Wt^n) is called an n−dimensional Wiener process if Wt^1, . . . , Wt^n are independent Wiener processes.
Proposition 2.29. (Uniqueness). If Wt and W′t are two Wiener processes, then the C([0, ∞])−valued random variables W., W′. : Ω → C([0, ∞]) have the same law [7].
Given a Wiener process Wt, we can introduce its natural filtration Ft^W = σ{Ws : s ≤ t}.
Definition 2.30. Let Ft be a filtration. Then a stochastic process Wt is called an
Ft−Wiener process if Wt is a Wiener process, is Ft−adapted, and Wt − Ws is independent of Fs for any t > s [7].
Lemma 2.31. An Ft−Wiener process Wt is an Ft−martingale [7].
A Wiener process is also a Markov process.
Definition 2.32. An Ft−adapted process Xt is called an Ft−Markov process if we have E(f(Xt) | Fs) = E(f(Xt) | Xs) for all t ≥ s and all bounded measurable functions f. When the filtration is not specified, the natural filtration Ft^X is implied [7].
Lemma 2.33. An Ft−Wiener process Wt is an Ft−Markov process [7].
Lemma 2.34. With unit probability, the sample paths of a Wiener process Wt are non- differentiable at any rational time t [7].
Clearly, the sample paths of Brownian motion are very rough; certainly the derivative of the Wiener process cannot be a sensible stochastic process [7]. This confirms the fact that white noise is not an ordinary stochastic process. Another measure of the irregularity of the sample paths of the Wiener process is their total variation [7]. For any real-valued function f(t), the total variation of f on the interval t ∈ [a, b] is defined as
$$TV(f, a, b) = \sup_{k \ge 0}\ \sup_{(t_i) \in P(k,a,b)}\ \sum_{i=0}^{k} |f(t_{i+1}) - f(t_i)|,$$
where P(k, a, b) denotes the set of all partitions a = t0 < t1 < · · · < tk < tk+1 = b.
Lemma 2.35. With unit probability, TV (W., a, b) = ∞ for any a < b. In other words, the sample paths of the Wiener process are a.s. of infinite variation [7].
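The divergence in Lemma 2.35 is visible numerically. In this sketch (ours, not from the thesis), the variation of a Wiener path on [0, 1] is evaluated on partitions with n equal steps: each |increment| has mean sqrt(2/(π n)), so the sum grows like sqrt(2n/π) and blows up as the mesh shrinks.

```python
import numpy as np

rng = np.random.default_rng(4)

# Sum of |increments| of a Wiener path on [0, 1] over n equal steps.
# Each increment is N(0, 1/n), so E|increment| = sqrt(2/(pi n)) and the
# sum grows like sqrt(2 n / pi): refining the partition makes the
# variation diverge, consistent with Lemma 2.35.
ns = [100, 400, 1_600, 6_400]
tvs = []
for n in ns:
    dW = rng.normal(0.0, np.sqrt(1.0 / n), size=n)
    tvs.append(np.abs(dW).sum())
print([round(v, 1) for v in tvs])
print([round(np.sqrt(2 * n / np.pi), 1) for n in ns])  # predicted growth
```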
2.2.6 Existence: A Multiscale Construction
We are finally ready to construct a Wiener process.
Lemma 2.36. Let {Wt : t ∈ [0, 1]} be a Wiener process on the probability space (Ω, F, P). Then there exists a Wiener process {W′t : t ∈ [0, ∞]} on some probability space (Ω′, F′, P′).
The second simplification is to make our random walks have continuous sample paths, unlike xt(N) which has jumps [7].
Lemma 2.37. Let fn(t), n = 1, 2, . . ., be a sequence of continuous functions on t ∈ [0, 1] that converge uniformly to some function f(t), i.e., $\sup_{t\in[0,1]} |f_n(t) - f(t)| \to 0$ as n → ∞. Then f(t) must be a continuous function [7].

Lemma 2.38. Let fn(t), n = 1, 2, . . ., be a sequence of continuous functions on t ∈ [0, 1] such that
$$\sum_{n} \sup_{t\in[0,1]} |f_{n+1}(t) - f_n(t)| < \infty.$$
Then fn(t) converge uniformly to some continuous function f(t) [7].
Theorem 2.39. There exists a Wiener process Wt on some probability space (Ω, F,P ) [7].
2.2.7 White Noise
White noise is generally defined as follows: it is a Gaussian “stochastic process” ξt with zero mean and covariance E (ξsξt) = δ (t − s) , where δ (·) is Dirac’s delta-“function” [7]. The delta function is defined by the relation
$$\int f(s)\, \delta(s)\, ds = f(0),$$
where f is an element in a suitable space of test functions. The simplest space of test functions is the space $C_0^\infty$ of smooth functions of compact support [7]. Obviously, ξt is not a stochastic process, since its covariance is not a function. However, we could think of ξt as an object whose sample paths are themselves generalized functions [7]. To make sense of this, we have to define the properties of white noise when integrated against a test function [7]. So let us integrate the defining properties of white noise against test functions:
$$\mathbb{E}(\xi(f)) = 0$$
and
$$\mathbb{E}(\xi(f)\,\xi(g)) \equiv \mathbb{E}\left[\int_{\mathbb{R}_+} f(s)\,\xi_s\, ds \int_{\mathbb{R}_+} g(t)\,\xi_t\, dt\right] = \int_{\mathbb{R}_+\times\mathbb{R}_+} f(s)\, g(t)\,\delta(t-s)\, ds\, dt = \int_{\mathbb{R}_+} f(t)\, g(t)\, dt \equiv \langle f, g\rangle.$$
In addition, the fact that ξt is a Gaussian "process" implies that ξ(f) should be a Gaussian random variable for any test function f. So we can now define white noise as a generalized stochastic process: it is a random linear functional ξ on $C_0^\infty$ such that ξ(f) is Gaussian, E(ξ(f)) = 0 and E(ξ(f) ξ(g)) = ⟨f, g⟩ for every f, g ∈ $C_0^\infty$. Given a Wiener process Wt, the stochastic integral
$$\xi(f) = \int_0^\infty f(t)\, dW_t, \qquad f \in C_0^\infty,$$
satisfies the definition of white noise as a generalized stochastic process.
Lemma 2.40. The stochastic integral of f ∈ $C_0^\infty$ with respect to the Wiener process Wt (as defined through integration by parts) is a white noise functional [7].
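The defining covariance property can be checked by Monte Carlo. In the sketch below (ours, not from the thesis), ξ(f) = ∫ f dW is discretized as Σ f(tᵢ) ΔWᵢ; the chosen f and g are smooth stand-ins for compactly supported test functions.

```python
import numpy as np

rng = np.random.default_rng(5)

# xi(f) = int_0^1 f(t) dW_t, discretized as sum_i f(t_i) dW_i.
# Check the white-noise property E[xi(f) xi(g)] = <f, g> = int f g dt.
n_steps, n_paths = 400, 10_000
t = np.arange(n_steps) / n_steps
dW = rng.normal(0.0, np.sqrt(1.0 / n_steps), size=(n_paths, n_steps))

f = np.sin(2 * np.pi * t)
g = np.exp(t)
xi_f = dW @ f            # one sample of xi(f) per simulated path
xi_g = dW @ g

inner = np.sum(f * g) / n_steps     # Riemann approximation of <f, g>
print(np.mean(xi_f))                # close to 0
print(np.mean(xi_f * xi_g), inner)  # close to each other
```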
2.3 The Stochastic Integral
Let (Ω, F, {Ft}t∈[0, ∞], P) be a filtered probability space and Wt an Ft−Wiener process. We are going to define stochastic integrals with respect to Wt.
Lemma 2.41. Let X. ∈ L2(µT × P), and suppose there exists a sequence of Ft−adapted simple processes X.n ∈ L2(µT × P) such that
$$\|X^n_\cdot - X_\cdot\|^2_{2,\,\mu_T\times P} = \mathbb{E}\left[\int_0^T (X^n_t - X_t)^2\, dt\right] \xrightarrow{n\to\infty} 0.$$
Then I (X.) can be defined as the limit in L2 (P ) of the simple integrals I (X.n) , and the definition does not depend on the choice of simple approximations X.n .
Lemma 2.42. Let X. ∈ L2(µT × P) be Ft−adapted. Then there exists a sequence of Ft−adapted simple processes X.n ∈ L2(µT × P) such that $\|X^n_\cdot - X_\cdot\|_{2,\,\mu_T\times P} \to 0$.
Definition 2.43. (Elementary Itô integral). Let Xt be any Ft−adapted process in L2(µT × P). Then the Itô integral I(X.), defined as the limit in L2(P) of simple integrals I(X.n), exists and is unique (i.e., is independent of the choice of X.n).
2.3.0.1 Continuous sample paths
Let X_t^n be an F_t-adapted simple process in L²(μ_T × P) with jump times t_i. For any time t ≤ T, we define the simple integral

I_t(X.^n) = ∫_0^t X_s^n dW_s = ∫_0^T X_s^n I_{s≤t} dW_s = Σ_{i=0}^N X_{t_i}^n (W_{t_{i+1}∧t} − W_{t_i∧t}).
The stochastic process I_t(X.^n) has continuous sample paths; this follows immediately from the fact that the Wiener process has continuous sample paths [7].
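The defining sum above can be computed directly. The following sketch (ours, with assumed jump times and values) evaluates I_t(X.^n) on one simulated Wiener path; continuity in t is visible because each interval contributes X_{t_i}(W_{t_{i+1}∧t} − W_{t_i∧t}), which moves only through the continuous path of W.

```python
import numpy as np

# Sketch of the simple integral I_t(X) = sum_i X_{t_i} (W_{t_{i+1} ∧ t} - W_{t_i ∧ t})
# for a piecewise-constant integrand (jump times and values assumed for
# illustration); the Wiener path is approximated on a fine grid.
rng = np.random.default_rng(1)

jump_times = np.array([0.0, 0.25, 0.5, 0.75, 1.0])   # t_0 < t_1 < ... < t_N = T
X_values = np.array([1.0, -2.0, 0.5, 3.0])           # X_s = X_values[i] on [t_i, t_{i+1})

grid = np.linspace(0.0, 1.0, 2001)
path = np.concatenate([[0.0], np.cumsum(rng.normal(0.0, np.sqrt(np.diff(grid))))])

def W(t):
    """Wiener path evaluated by interpolation on the fine grid."""
    return np.interp(t, grid, path)

def simple_integral(t):
    """I_t(X): interval [t_i, t_{i+1}) contributes X_{t_i} (W_{t_{i+1} ∧ t} - W_{t_i ∧ t})."""
    return sum(xv * (W(min(jump_times[i + 1], t)) - W(min(jump_times[i], t)))
               for i, xv in enumerate(X_values))

# Sample paths of t -> I_t(X) are continuous because the Wiener path is:
print([round(simple_integral(t), 3) for t in np.linspace(0.0, 1.0, 11)])
```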
Lemma 2.44. I_t(X.^n) is an F_t-martingale [7].
Lemma 2.45. Let X_t be an F_t-adapted process in L²(μ_T × P). Then the Itô integral I_t(X.), t ∈ [0, T], can be chosen to have continuous sample paths [7].
2.3.0.2 Localization
Lemma 2.46. For any F_t-adapted process X. ∈ ∩_{T<∞} L²(μ_T × P), we can define uniquely the Itô integral I_t(X.) as an F_t-adapted stochastic process on [0, ∞) with continuous sample paths [7].
Lemma 2.47. Let X_t be an F_t-adapted process in ∩_{T<∞} L²(μ_T × P), and let τ be an F_t-stopping time. Then I_{t∧τ}(X.) = I_t(X. I_{·<τ}) [7].
Lemma 2.48. Let X_t be an F_t-adapted process which admits a localizing sequence τ_n. Then I_t(X.) is uniquely defined as an F_t-adapted stochastic process on [0, ∞) with continuous sample paths and is independent of the choice of localizing sequence [7].
Definition 2.49. (Itô integral). Let X_t be any F_t-adapted stochastic process with

P( ∫_0^T X_t² dt < ∞ ) = 1 for all T < ∞.

Then the Itô integral

I_t(X.) = ∫_0^t X_s dW_s

is uniquely defined, by localization and the choice of a continuous modification, as an F_t-adapted stochastic process on [0, ∞) with continuous sample paths [7].
2.3.1 Some Elementary Properties
Lemma 2.50. (Linearity). Let X_t and Y_t be Itô integrable processes, and let α, β ∈ R. Then I_t(αX. + βY.) = α I_t(X.) + β I_t(Y.) [7].
Lemma 2.51. Let X_t be Itô integrable and let τ be an F_t-stopping time. Then

∫_0^{t∧τ} X_s dW_s = ∫_0^t X_s I_{s<τ} dW_s.
Lemma 2.52. Let X. ∈ ∩_{T<∞} L²(μ_T × P). Then for any T < ∞

E[ ∫_0^T X_t dW_t ] = 0,  E[ ( ∫_0^T X_t dW_t )² ] = E[ ∫_0^T X_t² dt ],

and moreover I_t(X.) is an F_t-martingale [7].
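Both identities of Lemma 2.52 (zero mean and the Itô isometry) can be checked by simulation. The sketch below (ours, not part of [7]) uses the concrete integrand X_t = W_t, for which both sides of the isometry equal T²/2.

```python
import numpy as np

# Monte Carlo check of Lemma 2.52 for the assumed integrand X_t = W_t:
# E[int_0^T W_t dW_t] = 0 and the Ito isometry
# E[(int_0^T W_t dW_t)^2] = E[int_0^T W_t^2 dt] = T^2 / 2.
rng = np.random.default_rng(2)
T, n_steps, n_paths = 1.0, 500, 20000
dt = T / n_steps

dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
W = np.cumsum(dW, axis=1)
W_left = np.hstack([np.zeros((n_paths, 1)), W[:, :-1]])  # left endpoints (Ito)

ito = np.sum(W_left * dW, axis=1)                 # one sample of int W dW per path
print(ito.mean())                                 # ~ 0
print(np.mean(ito ** 2))                          # ~ T^2 / 2 = 0.5
print(np.mean(np.sum(W_left ** 2, axis=1) * dt))  # ~ T^2 / 2 as well
```

Evaluating the integrand at the left endpoint of each increment is essential: it is exactly what makes the discretized integral a martingale with zero mean.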
Corollary 2.53. If X.^n → X. in L²(μ_T × P), then I_t(X.^n) → I_t(X.) in L²(P). Moreover, if the convergence is fast enough, then I_t(X.^n) → I_t(X.) a.s. [7].
Definition 2.54. An F_t-measurable process X_t is called an F_t-local martingale if there exists a sequence of F_t-stopping times τ_n ↗ ∞ such that X_{t∧τ_n} is a martingale for every n. The sequence τ_n is called a reducing sequence for X_t [7].
Lemma 2.55. Any Itô integral I_t(X.) is a local martingale [7].
2.3.2 The Itô Calculus
Let us work on a filtered probability space (Ω, F, {F_t}_{t∈[0,∞)}, P) on which we have defined an m-dimensional F_t-Wiener process W_t = (W_t^1, ..., W_t^m) (i.e., the W_t^i are independent F_t-Wiener processes). We consider F_t-adapted processes X_t^1, ..., X_t^n of the form

X_t^i = X_0^i + ∫_0^t F_s^i ds + Σ_{j=1}^m ∫_0^t G_s^{ij} dW_s^j,

where F_s^i, G_s^{ij} are F_t-progressively measurable processes that satisfy

∫_0^t |F_s^i| ds < ∞,  ∫_0^t (G_s^{ij})² ds < ∞  a.s. for all t < ∞ and all i, j.

We call X_t = (X_t^1, ..., X_t^n) an n-dimensional Itô process [7].
Definition 2.56. A process X_t = (X_t^1, ..., X_t^n) satisfying the above conditions is called an n-dimensional Itô process. It is also denoted as

X_t = X_0 + ∫_0^t F_s ds + ∫_0^t G_s dW_s.
Theorem 2.57. (Itô rule). Let u : [0, ∞) × R^n → R be a function such that u(t, x) is C¹ with respect to t and C² with respect to x. Then u(t, X_t) is an Itô process itself:

u(t, X_t) = u(0, X_0) + Σ_{i=1}^n Σ_{k=1}^m ∫_0^t u_i(s, X_s) G_s^{ik} dW_s^k
  + ∫_0^t { u′(s, X_s) + Σ_{i=1}^n u_i(s, X_s) F_s^i + (1/2) Σ_{i,j=1}^n Σ_{k=1}^m u_{ij}(s, X_s) G_s^{ik} G_s^{jk} } ds,

where we have written u′(t, x) = ∂u(t, x)/∂t, u_i(t, x) = ∂u(t, x)/∂x_i and u_{ij}(t, x) = ∂²u(t, x)/∂x_i ∂x_j [7].
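The Itô rule can be verified pathwise in the simplest scalar case. The sketch below (ours, not part of [7]) takes n = m = 1, X_t = W_t (so F = 0, G = 1) and u(t, x) = x², where the rule reduces to the classical identity W_t² = 2 ∫_0^t W_s dW_s + t; the extra "+t" is the second-order correction term absent in ordinary calculus.

```python
import numpy as np

# Pathwise sanity check of the Ito rule for n = m = 1, X_t = W_t (F = 0, G = 1),
# u(t, x) = x^2, so that u' = 0, u_1 = 2x, u_11 = 2 and the rule reads
#     W_T^2 = 2 int_0^T W_s dW_s + T.
rng = np.random.default_rng(3)
T, n_steps = 1.0, 100000
dt = T / n_steps

dW = rng.normal(0.0, np.sqrt(dt), size=n_steps)
W = np.concatenate([[0.0], np.cumsum(dW)])

stochastic_term = 2.0 * np.sum(W[:-1] * dW)  # 2 int_0^T W dW, left endpoints
correction = T                               # the (1/2) u_11 G^2 ds term

print(W[-1] ** 2)                            # u(T, X_T) = W_T^2
print(stochastic_term + correction)          # matches up to discretization error
```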
Remark 2.58. (Itô differentials). We will often use another notation for the Itô process, particularly when dealing with stochastic differential equations [7]:

dX_t = F_t dt + G_t dW_t.
2.3.3 Girsanov’s Theorem
Girsanov’s theorem tells us what happens to the Wiener process under a change of measure.
Theorem 2.59. (Girsanov). Let W_t be an m-dimensional F_t-Wiener process on the probability space (Ω, F, {F_t}_{t∈[0,∞)}, P), and let X_t be an Itô process of the form

X_t = ∫_0^t F_s ds + W_t,  t ∈ [0, T].

Suppose furthermore that F_t is Itô integrable, and define

Λ = exp( −∫_0^T (F_s)* dW_s − (1/2) ∫_0^T ‖F_s‖² ds )

(here (F_s)* dW_s = F_s^1 dW_s^1 + ··· + F_s^m dW_s^m). If Novikov's condition

E_P[ exp( (1/2) ∫_0^T ‖F_s‖² ds ) ] < ∞

is satisfied, then {X_t}_{t∈[0,T]} is an F_t-Wiener process under the measure Q(A) = E_P(Λ I_A).
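A quick Monte Carlo illustration of the theorem (ours, not part of [7]) takes the constant drift F_s = μ, for which Novikov's condition holds trivially and Λ = exp(−μ W_T − μ²T/2). Reweighting by Λ should make X_T = μT + W_T look like a Wiener endpoint: mean zero and variance T.

```python
import numpy as np

# Monte Carlo illustration of Girsanov's theorem with the assumed constant
# drift F_s = mu: X_t = mu*t + W_t and Lambda = exp(-mu*W_T - mu^2*T/2).
# Under Q(A) = E_P[Lambda 1_A], X should have Wiener statistics.
rng = np.random.default_rng(4)
mu, T, n_paths = 0.8, 1.0, 400000

W_T = rng.normal(0.0, np.sqrt(T), size=n_paths)
X_T = mu * T + W_T
Lam = np.exp(-mu * W_T - 0.5 * mu ** 2 * T)

print(Lam.mean())               # ~ 1 (Lambda is a mean-one density)
print(np.mean(Lam * X_T))       # ~ 0 = E_Q[X_T], as for a Wiener process
print(np.mean(Lam * X_T ** 2))  # ~ T = E_Q[X_T^2]
```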
Lemma 2.60. Let M_t, t ∈ [0, T], be a nonnegative local martingale. Then M_t is a supermartingale. In particular, if E(M_T) = E(M_0), then M_t is a martingale [7].
Lemma 2.61. Let F_t be Itô integrable and let W_t be a Wiener process. Then

E[ exp( (1/2) ∫_0^t (F_s)* dW_s ) ] ≤ √( E[ exp( (1/2) ∫_0^t ‖F_s‖² ds ) ] ).
Lemma 2.62. Let M_t be a nonnegative local martingale and let τ_n be a reducing sequence. If sup_n ‖M_{T∧τ_n}‖_p < ∞ for some p > 1, then {M_t}_{t∈[0,T]} is a martingale [7].
Theorem 2.63. For Itô integrable F_t, define the local martingale

ξ_t(F.) = exp( ∫_0^t (F_s)* dW_s − (1/2) ∫_0^t ‖F_s‖² ds ).

Suppose furthermore that the following condition is satisfied:

E[ exp( (1/2) ∫_0^T ‖F_s‖² ds ) ] = K < ∞.

Then {ξ_t(F.)}_{t∈[0,T]} is in fact a martingale [7].
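For the assumed constant integrand F_s = θ (a toy case of ours, not from [7]), Novikov's condition holds trivially and the theorem predicts E[ξ_t(F.)] = 1 for every t. A Monte Carlo check:

```python
import numpy as np

# Check of Theorem 2.63 for the assumed constant integrand F_s = theta:
# xi_t(F) = exp(theta*W_t - theta^2*t/2) should satisfy E[xi_t(F)] = 1
# for every t (Novikov's condition is trivially finite for constant F).
rng = np.random.default_rng(5)
theta, n_paths = 1.2, 500000

means = []
for t in (0.5, 1.0, 2.0):
    W_t = rng.normal(0.0, np.sqrt(t), size=n_paths)
    xi = np.exp(theta * W_t - 0.5 * theta ** 2 * t)
    means.append(xi.mean())

print(means)   # each entry ~ 1, consistent with the martingale property
```

Without the −θ²t/2 compensator the expectation would grow like e^{θ²t/2}, which is exactly why the exponential must be "corrected" to be a martingale.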
2.3.4 The Martingale Representation Theorem
Theorem 2.64. (Martingale representation). Let M_t be an F_t^W-martingale such that M_T ∈ L²(P). Then there is a unique F_t^W-adapted process {H_t}_{t∈[0,T]} in L²(μ_T × P) such that

M_t = M_0 + ∫_0^t H_s dW_s  a.s. for all t ∈ [0, T],

where the uniqueness of H_t is meant up to a μ_T × P-null set [7].
Actually, the theorem is a trivial corollary of the following result.
Theorem 2.65. (Itô representation). Let X be an F_T^W-measurable random variable in L²(P). Then there is a unique F_t^W-adapted process {H_t}_{t∈[0,T]} in L²(μ_T × P) such that

X = E(X) + ∫_0^T H_s dW_s  a.s.,

where the uniqueness of H_t is meant up to a μ_T × P-null set [7].
Lemma 2.66. Introduce the following class of F_T^W-measurable random variables:

S = { f(W_{t_1}, ..., W_{t_n}) : n < ∞, t_1, ..., t_n ∈ [0, T], f ∈ C_0^∞ }

(recall that C_0^∞ is the class of smooth functions with compact support). Then for any ε > 0 and F_T^W-measurable X ∈ L²(P), there is a Y ∈ S such that ‖X − Y‖₂ < ε [7].
Lemma 2.67. (Lévy's upward theorem). Let X ∈ L²(P) be G-measurable, and let G_n be a filtration such that G = σ( ∪_n G_n ). Then E(X | G_n) → X a.s. and in L²(P).
Lemma 2.68. (Approximate Itô representation). For any Y ∈ S, there is an F_t^W-adapted process H_t in L²(μ_T × P) such that

Y = E(Y) + ∫_0^T H_s dW_s.

In particular, this implies that any F_T^W-measurable random variable X ∈ L²(P) can be approximated arbitrarily closely in L²(P) by an Itô integral.
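For concrete functionals the representing integrand can sometimes be computed in closed form. As an example of ours (not from [7]), take X = W_T³: then E[X] = 0 and, computing the conditional expectation E[W_T³ | F_t] = W_t³ + 3(T − t)W_t from Gaussian moments, the representing integrand is H_s = 3W_s² + 3(T − s). The sketch below checks the representation pathwise.

```python
import numpy as np

# Illustration of the Ito representation theorem for the assumed example
# X = W_T^3: E[X] = 0 and the representing integrand is
# H_s = 3 W_s^2 + 3 (T - s), obtained from E[W_T^3 | F_t] = W_t^3 + 3(T-t) W_t.
rng = np.random.default_rng(6)
T, n_steps = 1.0, 200000
dt = T / n_steps

dW = rng.normal(0.0, np.sqrt(dt), size=n_steps)
W = np.concatenate([[0.0], np.cumsum(dW)])
s = np.arange(n_steps) * dt                # left endpoints s_i

H = 3.0 * W[:-1] ** 2 + 3.0 * (T - s)      # adapted: evaluated at left endpoints
integral = np.sum(H * dW)                  # Ito discretization of int_0^T H_s dW_s

print(W[-1] ** 3)                          # X on this path
print(integral)                            # E[X] + int H dW = 0 + ... : close pathwise
```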
Chapter 3: Stochastic Control and Dynamic Programming
In this Chapter, we assume that the filtration F is the P-augmentation of the canonical filtration of the Brownian motion W_t [9]. In this Chapter we are following [9] quite closely and do not claim authorship; our thrust is only to summarize. We will also denote by

S := [0, T) × R^d, where T ∈ [0, ∞].

The set S is called the parabolic interior of the state space [9]. We will denote by S̄ := cl(S) its closure, i.e. S̄ = [0, T] × R^d for finite T, and S̄ = S for T = ∞.
3.1 Stochastic Control Problems in Standard Form
Control processes. Given a subset U of R^k, we denote by U the set of all progressively measurable processes v = {v_t, t < T} valued in U. The elements of U are called control processes.
Controlled process. Let

b : (t, x, u) ∈ S × U → b(t, x, u) ∈ R^d and
σ : (t, x, u) ∈ S × U → σ(t, x, u) ∈ M_R(n, d)

be two continuous functions satisfying the conditions

|b(t, x, u) − b(t, y, u)| + |σ(t, x, u) − σ(t, y, u)| ≤ K |x − y|, (3.1.1)
|b(t, x, u)| + |σ(t, x, u)| ≤ K (1 + |x| + |u|), (3.1.2)
for some constant K independent of (t, x, y, u). For each control process v ∈ U, we consider the controlled stochastic differential equation [9]:
dX_t = b(t, X_t, v_t) dt + σ(t, X_t, v_t) dW_t. (3.1.3)
If the above equation has a unique solution X for a given initial condition, then the process X is called the controlled process, as its dynamics are driven by the action of the control process v [9].
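A minimal numerical sketch of a controlled process (ours, with all coefficients assumed): take b(t, x, u) = u and σ(t, x, u) = 0.5 constant in (3.1.3), and apply the feedback control v_t = −X_t, which turns the controlled process into an Ornstein-Uhlenbeck process whose moments are known in closed form.

```python
import numpy as np

# Hypothetical toy instance of the controlled SDE (3.1.3), simulated by
# Euler-Maruyama: b(t, x, u) = u, sigma(t, x, u) = 0.5 (assumed constants),
# under the feedback control v_t = -X_t (an Ornstein-Uhlenbeck process).
rng = np.random.default_rng(7)
T, n_steps, n_paths, sigma = 5.0, 5000, 10000, 0.5
dt = T / n_steps

X = np.full(n_paths, 2.0)                 # initial condition X_0 = 2
for _ in range(n_steps):
    v = -X                                # feedback control evaluated at X_t
    dW = rng.normal(0.0, np.sqrt(dt), size=n_paths)
    X = X + v * dt + sigma * dW           # Euler-Maruyama step for (3.1.3)

# For this OU process: E[X_T] = 2 e^{-T} and Var[X_T] -> sigma^2 / 2 for large T.
print(X.mean())   # ~ 2 e^{-5} = 0.013
print(X.var())    # ~ sigma^2 / 2 = 0.125
```

Euler-Maruyama is the discretization used throughout numerical stochastic control: the drift is evaluated at the current state and the control at the current time, keeping the scheme adapted.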
We shall be working with the following subclass of control processes:

U_0 := U ∩ H², (3.1.4)

where H² is the collection of all progressively measurable processes with finite L²(Ω × [0, T)) norm. Then, for every finite maturity T′ ≤ T, it follows from the above uniform Lipschitz condition on the coefficients b and σ that [9]

E[ ∫_0^{T′} (|b| + |σ|)²(s, x, v_s) ds ] < ∞ for all v ∈ U_0, x ∈ R^d,

which guarantees the existence of a controlled process on the time interval [0, T′] for each given initial condition and control [9].
Theorem 3.1. Let v ∈ U_0 be a control process, and ξ ∈ L²(P) be an F_0-measurable random variable. Then there exists a unique F-adapted process X^v satisfying (3.1.3) together with the initial condition X_0^v = ξ. Moreover, for every T > 0, there is a constant C > 0 such that

E[ sup_{0≤s≤t} |X_s^v|² ] < C (1 + E|ξ|²) e^{Ct} for all t ∈ [0, T). (3.1.5)
Gain functional. Let

f, k : [0, T) × R^d × U → R and g : R^d → R

be given functions. We assume that f, k are continuous and ‖k⁻‖_∞ < ∞ (i.e. max(−k, 0) is uniformly bounded). Moreover, we assume that f and g satisfy the quadratic growth condition [9]:

|f(t, x, u)| + |g(x)| ≤ K (1 + |u| + |x|²)

for some constant K independent of (t, x, u). We define the gain function J on [0, T] × R^d × U by [9]:

J(t, x, v) := E[ ∫_t^T β^v(t, s) f(s, X_s^{t,x,v}, v_s) ds + β^v(t, T) g(X_T^{t,x,v}) 1_{T<∞} ],

when this expression is meaningful, where

β^v(t, s) := e^{−∫_t^s k(r, X_r^{t,x,v}, v_r) dr},

and {X_s^{t,x,v}, s ≥ t} is the solution of (3.1.3) with control process v and initial condition X_t^{t,x,v} = x.
Admissible control processes. In the finite horizon case T < ∞, the quadratic growth condition on f and g together with the bound on k⁻ ensures that J(t, x, v) is well-defined for every control process v ∈ U_0. We then define the set of admissible controls in this case as U_0 [9].
More attention is needed for the infinite horizon case. In particular, the discount term k needs to play a role to ensure the finiteness of the integral. In this setting the largest set of admissible control processes is given by

U_0 := { v ∈ U : E[ ∫_0^∞ β^v(t, s) (1 + |X_s^{t,x,v}|² + |v_s|) ds ] < ∞ for all x } when T = ∞.
The stochastic control problem. Consider the optimization problem

V(t, x) := sup_{v∈U_0} J(t, x, v) for (t, x) ∈ S.

Our main concern is to describe the local behavior of the value function V by means of the so-called dynamic programming equation, or Hamilton-Jacobi-Bellman equation [9].
3.2 The Dynamic Programming Principle
3.2.1 A weak Dynamic Programming Principle
The dynamic programming principle is the main tool in the theory of stochastic control [9]. We denote:

V_*(t, x) := lim inf_{(t′,x′)→(t,x)} V(t′, x′) and V*(t, x) := lim sup_{(t′,x′)→(t,x)} V(t′, x′),

for all (t, x) ∈ S̄.
Theorem 3.2. Assume that V is locally bounded. Let (t, x) ∈ S be fixed, and let {θ^v, v ∈ U_t} be a family of finite stopping times independent of F_t with values in [t, T]. Then:

V(t, x) ≤ sup_{v∈U_t} E[ ∫_t^{θ^v} β^v(t, s) f(s, X_s^{t,x,v}, v_s) ds + β^v(t, θ^v) V*(θ^v, X_{θ^v}^{t,x,v}) ].

Assume further that g is lower-semicontinuous and X^{t,x,v} 1_{[t,θ^v]} is L^∞-bounded for all v ∈ U_t. Then

V(t, x) ≥ sup_{v∈U_t} E[ ∫_t^{θ^v} β^v(t, s) f(s, X_s^{t,x,v}, v_s) ds + β^v(t, θ^v) V_*(θ^v, X_{θ^v}^{t,x,v}) ].
3.3 The Dynamic Programming Equation
The dynamic programming equation is the infinitesimal counterpart of the dynamic pro- gramming principle. It is also widely called the Hamilton-Jacobi-Bellman equation. In this section, we shall derive it under strong smoothness assumptions on the value function [9].
Let S_d be the set of all d × d symmetric matrices with real coefficients, and define the map H : S̄ × R × R^d × S_d → R by:

H(t, x, r, p, γ) := sup_{u∈U} { −k(t, x, u) r + b(t, x, u) · p + (1/2) Tr[ σσ^T(t, x, u) γ ] + f(t, x, u) }.
We also need to introduce the linear second order operator L^u associated to the controlled process {β^u(0, t) X_t^u, t ≥ 0} controlled by the constant control process u:

L^u φ(t, x) := −k(t, x, u) φ(t, x) + b(t, x, u) · Dφ(t, x) + (1/2) Tr[ σσ^T(t, x, u) D²φ(t, x) ],

where D and D² denote the gradient and the Hessian operators with respect to the x variable. With this notation, we have by Itô's formula:

β^v(0, s) φ(s, X_s^v) − β^v(0, t) φ(t, X_t^v) = ∫_t^s β^v(0, r) (∂_t + L^{v_r}) φ(r, X_r^v) dr + ∫_t^s β^v(0, r) Dφ(r, X_r^v) · σ(r, X_r^v, v_r) dW_r

for every s ≥ t and smooth function φ ∈ C^{1,2}([t, s], R^d) and each admissible control process v ∈ U_0.
Proposition 3.3. Assume the value function V ∈ C^{1,2}([0, T), R^d), and let the coefficients k(·, ·, u) and f(·, ·, u) be continuous in (t, x) for all fixed u ∈ U. Then, for all (t, x) ∈ S:

−∂_t V(t, x) − H(t, x, V(t, x), DV(t, x), D²V(t, x)) ≥ 0. (3.3.1)
Proposition 3.4. Assume V ∈ C^{1,2}([0, T), R^d) and H(·, V, DV, D²V) > −∞. Assume further that k is bounded and the function H is continuous. Then, for all (t, x) ∈ S:

−∂_t V(t, x) − H(t, x, V(t, x), DV(t, x), D²V(t, x)) ≤ 0. (3.3.2)
As a consequence of Propositions 3.3 and 3.4, we have the main result of this section:
Theorem 3.5. Let the conditions of Propositions 3.3 and 3.4 hold. Then, the value function V solves the Hamilton-Jacobi-Bellman equation

−∂_t V − H(·, V, DV, D²V) = 0 on S. (3.3.3)
Note: The value function should not be expected to be smooth in general.
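The HJB equation (3.3.3) is what one actually discretizes in numerical stochastic optimal control. The following sketch (ours; the coefficients b = u with u ∈ {−1, 0, 1}, constant σ, f = k = 0 and terminal reward g(x) = −x² are illustrative assumptions, not from [9]) solves the equation backward in time by an explicit monotone finite-difference scheme, taking the pointwise supremum over the control grid at each step.

```python
import numpy as np

# Backward-in-time finite-difference sketch for the HJB equation
#   -dV/dt - sup_u { u * DV + (sigma^2/2) * D^2 V } = 0,  V(T, .) = g,
# on the toy problem b = u in U = {-1, 0, 1}, f = k = 0, g(x) = -x^2.
sigma, T = 0.5, 1.0
x = np.linspace(-3.0, 3.0, 301)
dx = x[1] - x[0]
dt = 0.4 * dx ** 2 / sigma ** 2          # explicit-scheme stability (CFL) choice
n_steps = int(np.ceil(T / dt))
controls = (-1.0, 0.0, 1.0)

V = -x ** 2                               # terminal condition V(T, .) = g
for _ in range(n_steps):
    fwd = np.empty_like(V); bwd = np.empty_like(V); lap = np.zeros_like(V)
    fwd[:-1] = (V[1:] - V[:-1]) / dx;  fwd[-1] = fwd[-2]   # one-sided at edges
    bwd[1:] = (V[1:] - V[:-1]) / dx;   bwd[0] = bwd[1]
    lap[1:-1] = (V[2:] - 2.0 * V[1:-1] + V[:-2]) / dx ** 2
    lap[0] = lap[1]; lap[-1] = lap[-2]
    # Hamiltonian: pointwise sup over the control grid, upwinded in the drift.
    H = np.max([u * np.where(u > 0, fwd, bwd) + 0.5 * sigma ** 2 * lap
                for u in controls], axis=0)
    V = V + dt * H                        # one backward Euler step in time

print(V[150])          # V(0, 0): negative, since diffusion pushes X away from 0
```

Upwinding the drift term (forward difference for u > 0, backward for u < 0) together with the CFL restriction on dt makes the scheme monotone, which is the standard sufficient condition for convergence to the (viscosity) solution when the value function is not smooth, as the note above warns.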
3.3.1 Continuity of the Value Function for Bounded Controls
Proposition 3.6. Let f = k ≡ 0, T < ∞, and assume that g is Lipschitz continuous. Then:
1. V is Lipschitz in x, uniformly in t.
2. Assume further that U is bounded. Then V is (1/2)-Hölder-continuous in t, and there is a constant C ≥ 0 such that:

|V(t, x) − V(t′, x)| ≤ C (1 + |x|) √|t − t′| for all t, t′ ∈ [0, T], x ∈ R^d.
Chapter 4: Optimal Stopping and Dynamic Programming
In this Chapter, our goal is to derive results similar to those obtained in the previous chapter for standard stochastic control problems, now in the context of optimal stopping problems. We are closely following [9]; our thrust is only to summarize.
4.1 Optimal Stopping Problems
For 0 ≤ t ≤ T < ∞, we denote by T_{[t,T]} the collection of all F-stopping times with values in [t, T]. We also recall the notation S := [0, T) × R^n for the parabolic state space of the underlying state process X defined by the stochastic differential equation:
dX_t = b(t, X_t) dt + σ(t, X_t) dW_t, (4.1.1)
where b and σ are defined on S and take values in R^n and S_n, respectively. We assume that b and σ satisfy the usual Lipschitz and linear growth conditions so that the above SDE has a unique strong solution [9].
The infinitesimal generator of the Markov diffusion process X is denoted by

Aφ := b · Dφ + (1/2) Tr[ σσ^T D²φ ].
Let g be a continuous function from R^n to R, and assume that:

E[ sup_{0≤t≤T} |g(X_t)| ] < ∞. (4.1.2)
For instance, if g has polynomial growth, the previous integrability condition is automatically satisfied. Under this condition, the criterion

J(t, x, τ) := E[ g(X_τ^{t,x}) ] (4.1.3)

is well-defined for all (t, x) ∈ S and τ ∈ T_{[t,T]}. Here, X^{t,x} denotes the unique strong solution of (4.1.1) with initial condition X_t^{t,x} = x.
The optimal stopping problem is now defined by:

V(t, x) := sup_{τ∈T_{[t,T]}} J(t, x, τ) for all (t, x) ∈ S. (4.1.4)

A stopping time τ̂ ∈ T_{[t,T]} is called an optimal stopping rule if V(t, x) = J(t, x, τ̂).

The set

S := {(t, x) : V(t, x) = g(x)} (4.1.5)

is called the stopping region and is of particular interest: whenever the state is in this region, it is optimal to stop immediately. Its complement S^c is called the continuation region [9].
4.2 The Dynamic Programming Principle
Theorem 4.1. Assume that V is locally bounded. For (t, x) ∈ S, let θ ∈ T_{[t,T]} be a stopping time such that X_θ^{t,x} is bounded. Then:

V(t, x) ≤ sup_{τ∈T^t_{[t,T]}} E[ 1_{τ<θ} g(X_τ^{t,x}) + 1_{τ≥θ} V*(θ, X_θ^{t,x}) ], (4.2.1)

V(t, x) ≥ sup_{τ∈T^t_{[t,T]}} E[ 1_{τ<θ} g(X_τ^{t,x}) + 1_{τ≥θ} V_*(θ, X_θ^{t,x}) ]. (4.2.2)
4.3 The Dynamic Programming Equation
Theorem 4.2. Assume that V ∈ C^{1,2}([0, T), R^n), and let g : R^n → R be continuous. Then V solves the obstacle problem:

min{ −(∂_t + A)V, V − g } = 0 on S. (4.3.1)
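The obstacle problem (4.3.1) translates directly into a numerical scheme: propagate the continuation value backward with (∂_t + A)V = 0 and project onto the obstacle g at each step. The sketch below (ours; the drift b = 0.3, volatility σ = 0.4 and payoff g(x) = max(K − x, 0) are illustrative assumptions, not from [9]) also reads off the stopping region as the set where V = g.

```python
import numpy as np

# Backward-induction sketch for the obstacle problem (4.3.1) on a toy
# stopping problem: dX = b dt + sigma dW with assumed constants b = 0.3,
# sigma = 0.4, payoff g(x) = max(K - x, 0), A = b d/dx + (sigma^2/2) d^2/dx^2.
b, sigma, T, K = 0.3, 0.4, 1.0, 1.0
x = np.linspace(-2.0, 4.0, 301)
dx = x[1] - x[0]
dt = 0.4 * dx ** 2 / sigma ** 2           # explicit-scheme stability choice
n_steps = int(np.ceil(T / dt))

g = np.maximum(K - x, 0.0)
V = g.copy()                              # terminal condition V(T, .) = g
for _ in range(n_steps):
    fwd = np.empty_like(V)
    fwd[:-1] = (V[1:] - V[:-1]) / dx      # upwind difference (b > 0)
    fwd[-1] = fwd[-2]
    lap = np.zeros_like(V)
    lap[1:-1] = (V[2:] - 2.0 * V[1:-1] + V[:-2]) / dx ** 2
    cont = V + dt * (b * fwd + 0.5 * sigma ** 2 * lap)  # continuation value
    V = np.maximum(g, cont)               # obstacle: stop where g beats continuing

stop = (V <= g + 1e-10) & (g > 0)         # interior of the stopping region
print(V[np.searchsorted(x, K)])           # V(0, K) > 0 = g(K): continue at x = K
print(x[stop].max())                      # right edge of the stopping region (< K)
```

The upward drift makes waiting costly in the deep in-the-money region, so a nonempty stopping region appears to the left of K, while near K the diffusion makes continuation strictly more valuable than immediate stopping.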
4.4 Regularity of the Value Function
4.4.1 Finite Horizon Optimal Stopping
In this subsection, we consider the case T < ∞. Similar to the continuity result of Proposition 3.6 for the stochastic control framework, we have:
Proposition 4.3. Assume g is Lipschitz-continuous, and let T < ∞. Then, there is a constant C such that:

|V(t, x) − V(t′, x′)| ≤ C ( |x − x′| + √|t − t′| ) for all (t, x), (t′, x′) ∈ S̄.
Chapter 5: Solving Control Problems by Verification

In this Chapter, we present a general argument, based on Itô's formula, which allows one to show that some "guess" of the value function is indeed equal to the unknown value function. Namely, given a smooth solution v of the dynamic programming equation, we give sufficient conditions which allow us to conclude that v coincides with the value function V. This is the so-called verification argument [9]. In this Chapter we are following [9] quite closely and do not claim authorship.
5.1 The Verification Argument for Stochastic Control Problems
We recall the stochastic control problem formulation of Section 3.1. The set of admissible control processes U_0 ⊂ U is the collection of all progressively measurable processes with values in the subset U ⊂ R^k. For every admissible control process v ∈ U_0, the controlled process is defined by the stochastic differential equation:

dX_t^v = b(t, X_t^v, v_t) dt + σ(t, X_t^v, v_t) dW_t.
The gain criterion is given by

J(t, x, v) := E[ ∫_t^T β^v(t, s) f(s, X_s^{t,x,v}, v_s) ds + β^v(t, T) g(X_T^{t,x,v}) ]

with

β^v(t, s) := e^{−∫_t^s k(r, X_r^{t,x,v}, v_r) dr}.
The stochastic control problem is defined by the value function:

V(t, x) := sup_{v∈U_0} J(t, x, v) for (t, x) ∈ S. (5.1.1)
We follow the notations of Section 3.3. We recall the Hamiltonian H : S̄ × R × R^d × S_d → R defined by:

H(t, x, r, p, γ) := sup_{u∈U} { −k(t, x, u) r + b(t, x, u) · p + (1/2) Tr[ σσ^T(t, x, u) γ ] + f(t, x, u) },

where b and σ satisfy the conditions (3.1.1)−(3.1.2), and the coefficients f and k are measurable. From the results of the previous section, the dynamic programming equation corresponding to the stochastic control problem (5.1.1) is:
−∂_t v − H(·, v, Dv, D²v) = 0 and v(T, ·) = g. (5.1.2)
A function v will be called a supersolution (resp. subsolution) of Equation (5.1.2) if

−∂_t v − H(·, v, Dv, D²v) ≥ (resp. ≤) 0 and v(T, ·) ≥ (resp. ≤) g.
The proof of the subsequent result will make use of the following linear second-order operator

L^u φ(t, x) := −k(t, x, u) φ(t, x) + b(t, x, u) · Dφ(t, x) + (1/2) Tr[ σσ^T(t, x, u) D²φ(t, x) ],

which corresponds to the controlled process {β^u(0, t) X_t^u, t ≥ 0} controlled by the constant control process u, in the sense that

β^v(0, s) φ(s, X_s^v) − β^v(0, t) φ(t, X_t^v) = ∫_t^s β^v(0, r) (∂_t + L^{v_r}) φ(r, X_r^v) dr + ∫_t^s β^v(0, r) Dφ(r, X_r^v) · σ(r, X_r^v, v_r) dW_r

for every t ≤ s and smooth function φ ∈ C^{1,2}([t, s], R^d) and each admissible control process v ∈ U_0. The last expression is an immediate application of Itô's formula.
Theorem 5.1. Let T < ∞, and v ∈ C^{1,2}([0, T), R^d) ∩ C([0, T] × R^d). Assume that ‖k⁻‖_∞ < ∞ and v and f have quadratic growth, i.e. there is a constant C such that