Jan Kallsen
Stochastic Optimal Control in Mathematical Finance
Lecture Notes
Kiel and Århus University, as of September 20, 2016

Contents
0 Motivation

I Discrete time

1 Recap of stochastic processes
  1.1 Processes, stopping times, martingales
  1.2 Stochastic integration
  1.3 Conditional jump distribution
  1.4 Essential supremum

2 Dynamic Programming

3 Optimal Stopping

4 Markovian situation

5 Stochastic Maximum Principle

II Continuous time

6 Recap of stochastic processes
  6.1 Continuous semimartingales
    6.1.1 Processes, stopping times, martingales
    6.1.2 Brownian motion
    6.1.3 Quadratic variation
    6.1.4 Square-integrable martingales
    6.1.5 Stopping times
  6.2 Stochastic integral
    6.2.1 Differential notation
    6.2.2 Itô processes
    6.2.3 Itô diffusions
    6.2.4 Doléans exponential
    6.2.5 Martingale representation
    6.2.6 Change of measure

7 Dynamic programming

8 Optimal stopping

9 Markovian situation
  9.1 Stochastic control
  9.2 Optimal Stopping

10 Stochastic maximum principle

A Exercises

Bibliography

Chapter 0
Motivation
In Mathematical Finance one often faces optimisation problems of various kinds, in particular when it comes to choosing trading strategies with, in some sense, maximal utility or minimal risk. The choice of an optimal exercise time of an American option belongs to this category as well. Such problems can be tackled with different methods. We distinguish two main approaches, which are discussed both in discrete and in continuous time. As a motivation we first consider the simple situation of maximising a deterministic function of one or several variables.
Example 0.1 1. (Direct approach) Suppose that the goal is to maximise a function

$$(x, \alpha) \mapsto \sum_{t=1}^{T} f(t, x_{t-1}, \alpha_t) + g(x_T)$$

over all $x = (x_1, \ldots, x_T) \in (\mathbb{R}^d)^T$, $\alpha = (\alpha_1, \ldots, \alpha_T) \in A^T$ such that

$$\Delta x_t := x_t - x_{t-1} = \delta(x_{t-1}, \alpha_t), \qquad t = 1, \ldots, T$$

for some given function $\delta \colon \mathbb{R}^d \times \mathbb{R}^m \to \mathbb{R}^d$. The initial value $x_0 \in \mathbb{R}^d$, the state space of controls $A \subset \mathbb{R}^m$ and the objective functions $f \colon \{1, \ldots, T\} \times \mathbb{R}^d \times A \to \mathbb{R}$, $g \colon \mathbb{R}^d \to \mathbb{R}$ are supposed to be given. The approach in Chapters 2 and 7 below corresponds to finding the maximum directly, without relying on smoothness or convexity of the functions $f, g, \delta$ or on topological properties of $A$. Rather, the idea is to reduce the problem to a sequence of simpler optimisations in just one $A$-valued variable $\alpha_t$.

2. (Lagrange multiplier approach) Since the problem above concerns constrained optimisation, Lagrange multiplier techniques may make sense. To this end, define the Lagrange function
$$L(x, \alpha, y) := \sum_{t=1}^{T} f(t, x_{t-1}, \alpha_t) + g(x_T) - \sum_{t=1}^{T} y_t (\Delta x_t - \delta(x_{t-1}, \alpha_t))$$

on $(\mathbb{R}^d)^T \times A^T \times (\mathbb{R}^d)^T$. The usual first-order conditions lead us to look for a candidate $x^\star \in (\mathbb{R}^d)^T$, $\alpha^\star \in A^T$, $y^\star \in (\mathbb{R}^d)^T$ satisfying
(a) $\Delta x_t^\star = \delta(x_{t-1}^\star, \alpha_t^\star)$ for $t = 1, \ldots, T$, where we set $x_0^\star := x_0$,

(b) $y_T^\star = \nabla g(x_T^\star)$,

(c) $\Delta y_t^\star = -\nabla_x H(t, x_{t-1}^\star, \alpha_t^\star)$ for $t = 1, \ldots, T$, where we set $H(t, \xi, a) := f(t, \xi, a) + y_t^\star \delta(\xi, a)$ and $\nabla_x H$ denotes the gradient of $H$ viewed as a function of its second argument,

(d) $\alpha_t^\star$ maximises $a \mapsto H(t, x_{t-1}^\star, a)$ on $A$ for $t = 1, \ldots, T$.

Provided that some convexity conditions hold, (a)–(d) are in fact sufficient for optimality of $\alpha^\star$:
Lemma 0.2 Suppose that the set $A$ is convex, $\xi \mapsto g(\xi)$ and $(\xi, a) \mapsto H(t, \xi, a)$, $t = 1, \ldots, T$, are concave, and $\xi \mapsto g(\xi)$ and $\xi \mapsto H(t, \xi, a)$, $t = 1, \ldots, T$, $a \in A$, are differentiable. If Conditions (a)–(d) hold, then $(x^\star, \alpha^\star)$ is optimal for the problem in Example 0.1(1).
Proof. For any competitor $(x, \alpha)$ satisfying the constraints set $h(t, \xi) := \sup_{a \in A} H(t, \xi, a)$. Condition (d) yields $h(t, x_{t-1}^\star) = H(t, x_{t-1}^\star, \alpha_t^\star)$ for $t = 1, \ldots, T$. We have

$$\begin{aligned}
&\sum_{t=1}^{T} f(t, x_{t-1}, \alpha_t) + g(x_T) - \sum_{t=1}^{T} f(t, x_{t-1}^\star, \alpha_t^\star) - g(x_T^\star) \\
&\quad= \sum_{t=1}^{T} \Big( H(t, x_{t-1}, \alpha_t) - H(t, x_{t-1}^\star, \alpha_t^\star) - y_t^\star (\Delta x_t - \Delta x_t^\star) \Big) + g(x_T) - g(x_T^\star) \\
&\quad\le \sum_{t=1}^{T} \Big( \big( H(t, x_{t-1}, \alpha_t) - h(t, x_{t-1}) \big) + h(t, x_{t-1}) - h(t, x_{t-1}^\star) - y_t^\star (\Delta x_t - \Delta x_t^\star) \Big) + \nabla g(x_T^\star)(x_T - x_T^\star) \\
&\quad\le \sum_{t=1}^{T} \Big( \nabla_x h(t, x_{t-1}^\star)(x_{t-1} - x_{t-1}^\star) - y_t^\star (\Delta x_t - \Delta x_t^\star) \Big) + \nabla g(x_T^\star)(x_T - x_T^\star) \qquad (0.1) \\
&\quad= \sum_{t=1}^{T} \Big( -\Delta y_t^\star (x_{t-1} - x_{t-1}^\star) - y_t^\star (\Delta x_t - \Delta x_t^\star) \Big) + y_T^\star (x_T - x_T^\star) \qquad (0.2) \\
&\quad= y_0^\star (x_0 - x_0^\star) = 0,
\end{aligned}$$

where existence of $\nabla_x h(t, x_{t-1}^\star)$, inequality (0.1) as well as equation (0.2) follow from Lemma 0.3 below and the concavity of $g$.
Under some more convexity (e.g. if $\delta$ is affine and $f(t, \cdot, \cdot)$ is concave for $t = 1, \ldots, T$), the Lagrange multiplier solves some dual minimisation problem. This happens e.g. in the stochastic Examples 5.3–5.8 in Chapter 5.
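As a numerical illustration of the direct approach, the following Python sketch solves a tiny instance of the problem in Example 0.1(1) by backward induction and confirms the value against brute-force enumeration over all control sequences. All data in it (horizon, dynamics, rewards) are invented for illustration and not taken from these notes.

```python
from itertools import product
from functools import lru_cache

# Toy instance of Example 0.1(1): maximise sum_t f(t, x_{t-1}, alpha_t) + g(x_T)
# subject to x_t = x_{t-1} + delta(x_{t-1}, alpha_t).  All data below are made up.
T = 4
x0 = 3
A = (-1, 0, 1)                                # finite control set
delta = lambda x, a: a                        # controlled increment
f = lambda t, x, a: -x**2 - 0.1 * abs(a)      # running reward, penalises state and control
g = lambda x: -x**2                           # terminal reward

@lru_cache(maxsize=None)
def V(t, x):
    """Optimal reward-to-go after t periods in state x (backward induction)."""
    if t == T:
        return g(x)
    return max(f(t + 1, x, a) + V(t + 1, x + delta(x, a)) for a in A)

def total(alpha):
    """Total reward of a fixed control sequence alpha = (alpha_1, ..., alpha_T)."""
    x, r = x0, 0.0
    for t, a in enumerate(alpha, start=1):
        r += f(t, x, a)
        x += delta(x, a)
    return r + g(x)

best = max(total(alpha) for alpha in product(A, repeat=T))
assert abs(V(0, x0) - best) < 1e-12           # dynamic programming = brute force
```

The point is precisely the one made above: the $T$-fold maximisation collapses into $T$ one-variable maximisations over $A$.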
The following lemma is a version of the envelope theorem, which makes a statement about the derivative of the maximum of a parametrised function.
Lemma 0.3 Let $A$ be a convex set, $f \colon \mathbb{R}^d \times A \to \mathbb{R} \cup \{-\infty\}$ a concave function, and $\tilde f(x) := \sup_{a \in A} f(x, a)$, $x \in \mathbb{R}^d$. Then $\tilde f$ is concave. Suppose in addition that, for some fixed $x^\star \in \mathbb{R}^d$, the optimiser $a^\star := \operatorname{argmax}_{a \in A} f(x^\star, a)$ exists and $x \mapsto f(x, a^\star)$ is differentiable in $x^\star$. Then $\tilde f$ is differentiable in $x^\star$ with derivative

$$D_i \tilde f(x^\star) = D_i f(x^\star, a^\star), \qquad i = 1, \ldots, d. \qquad (0.3)$$
Proof. One easily verifies that $\tilde f$ is concave. For $h \in \mathbb{R}^d$ we have

$$\tilde f(x^\star + yh) \ge f(x^\star + yh, a^\star) = f(x^\star, a^\star) + y \sum_{i=1}^{d} D_i f(x^\star, a^\star) h_i + o(y)$$

as $y \in \mathbb{R}$ tends to 0. In view of [HUL13, Proposition I.1.1.4], concavity of $\tilde f$ implies that we actually have

$$\tilde f(x^\star + yh) \le f(x^\star, a^\star) + y \sum_{i=1}^{d} D_i f(x^\star, a^\star) h_i$$

and hence differentiability of $\tilde f$ in $x^\star$ with derivative (0.3).
In the remainder of this course we consider optimisation in a dynamic stochastic setup. Green parts in these notes are skipped either because they are assumed to be known (Chapters 1 and 6) or for lack of time. Comments are welcome, in particular if they concern errors in this text.

Part I
Discrete time
Chapter 1
Recap of stochastic processes
The theory of stochastic processes deals with random functions of time, such as asset prices, interest rates, or trading strategies. As is true for Mathematical Finance as well, it can be developed in both discrete and continuous time. Actual calculations are sometimes easier and more transparent in continuous-time models, but the theory typically requires less background in discrete time.
1.1 Processes, stopping times, martingales
The natural starting point in probability theory is a probability space (Ω, F, P). The more or less abstract sample space Ω stands for the possible outcomes of the random experiment. It could e.g. contain all conceivable sample paths of a stock price process. The probability measure P assigns probabilities to subsets of outcomes. For measure-theoretic reasons it is typically impossible to assign probabilities to all subsets of Ω in a consistent manner. As a way out one specifies a σ-field F, i.e. a collection of subsets of Ω which is closed under countable set operations such as ∩, ∪, \, and complementation. The probability P(F) is defined only for events F ∈ F.

Random variables X are functions of the outcome ω ∈ Ω. Typically their values X(ω) are numbers, but they may also be vectors or even functions, in which case X is a random vector resp. process. We denote by E(X), Var(X) the expected value and variance of a real-valued random variable. Accordingly, E(X), Cov(X) denote the expectation vector and covariance matrix of a random vector X.

For static random experiments one needs to consider only two states of information. Before the experiment nothing precise is known about the outcome; only probabilities and expected values can be assigned. After the experiment the outcome is completely determined. In dynamic random experiments such as stock markets the situation is more involved. During the time interval of observation, some random events (e.g. yesterday's stock returns) have already happened and can be considered as deterministic, whereas others (e.g. tomorrow's stock returns) still belong to the unknown future.

As time passes, more and more information is accumulated. This increasing knowledge is expressed mathematically in terms of a filtration F = (Ft)t≥0, i.e. an increasing sequence of sub-σ-fields of F. The collection of events Ft stands for the observable information up to time t. The statement F ∈ Ft means that the
random event F (e.g. F = {stock return positive at time t − 1}) is no longer random at time t. We know for sure whether it is true or not. If our observable information is e.g. given by the evolution of the stock price, then Ft contains all events that can be expressed in terms of the stock price up to time t. The quadruple (Ω, F, F, P) is called a filtered probability space. We consider it to be fixed during most of the following. Often one assumes F0 = {∅, Ω}, i.e. F0 is the trivial σ-field corresponding to no prior information.

As time passes, not only the observable information but also probabilities and expectations of future events change. E.g. our conception of the terminal stock price evolves gradually from vague ideas to perfect knowledge. This is modelled mathematically in terms of conditional expectations. The conditional expectation E(X|Ft) of a random variable X is its expected value given the information up to time t. As such, it is not a number but itself a random variable which may depend on the randomness up to time t, e.g. on the stock price up to t in the above example. Mathematically speaking, Y = E(X|Ft) is Ft-measurable, which means that {Y ∈ B} := {ω ∈ Ω : Y(ω) ∈ B} ∈ Ft for any reasonable (i.e. Borel) set B. Accordingly, the conditional probability P(F|Ft) denotes the probability of an event F ∈ F given the information up to time t. As is true for conditional expectation, it is not a number but an Ft-measurable random variable.

Formally, the conditional expectation E(X|Ft) is defined as the unique Ft-measurable random variable Y such that E(XZ) = E(YZ) for any bounded, Ft-measurable random variable Z. It can also be interpreted as the best prediction of X given Ft. Indeed, if E(X²) < ∞, then E(X|Ft) minimises the mean squared difference E((X − Z)²) among all Ft-measurable random variables Z. Strictly speaking, E(X|Ft) is unique only up to a set of probability 0, i.e.
any two versions Y, Ỹ satisfy P(Y ≠ Ỹ) = 0. In these notes we do not make such fine distinctions. Equalities, inequalities etc. are always meant to hold only almost surely, i.e. up to a set of probability 0.

A few rules on conditional expectations are used over and over again. E.g. we have
E(X|Ft) = E(X) (1.1) if Ft = {∅, Ω} is the trivial σ-field representing no information on random events. More generally, (1.1) holds if X and Ft are stochastically independent, i.e. if P ({X ∈ B} ∩ F ) = P (X ∈ B)P (F ) for any reasonable (i.e. Borel) set B and any F ∈ Ft. On the other hand we have E(X|Ft) = X and more generally
E(XY|Ft) = X E(Y|Ft)

if X is Ft-measurable, i.e. known at time t. The law of iterated expectations tells us that E(E(X|Ft)|Fs) = E(X|Fs) for s ≤ t. Almost as a corollary we have E(E(X|Ft)) = E(X). Finally, the conditional expectation shares many properties of the expectation, e.g. it is linear and monotone in X and it satisfies monotone and dominated convergence, Fatou's lemma, Jensen's inequality etc.
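On a finite probability space these rules can be checked by elementary averaging over the atoms of Ft. A minimal Python sketch (the two-coin-toss setup and the random variable are invented for illustration):

```python
from itertools import product
from collections import defaultdict

# Omega = outcomes of two fair coin tosses; F_1 is generated by the first toss.
omega = list(product([0, 1], repeat=2))
P = {w: 0.25 for w in omega}
X = {w: w[0] + 2 * w[1] for w in omega}       # some random variable

def cond_exp(X, P, atom):
    """E(X | sigma(atom)): average X over each atom, return a function on Omega."""
    num, den = defaultdict(float), defaultdict(float)
    for w, p in P.items():
        num[atom(w)] += p * X[w]
        den[atom(w)] += p
    return {w: num[atom(w)] / den[atom(w)] for w in P}

Y = cond_exp(X, P, atom=lambda w: w[0])       # E(X | F_1)
# Y is F_1-measurable: it depends on omega only through the first toss,
assert Y[(0, 0)] == Y[(0, 1)] and Y[(1, 0)] == Y[(1, 1)]
# and the law of iterated expectations E(E(X|F_1)) = E(X) holds:
EX = sum(P[w] * X[w] for w in omega)
assert abs(sum(P[w] * Y[w] for w in omega) - EX) < 1e-12
```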
Recall that the probability of a set can be expressed as expectation of an indicator func- tion via P (F ) = E(1F ). This suggests to use the relation
P(F|Ft) := E(1F|Ft) (1.2)

to define conditional probabilities in terms of conditional expectation. Of course, we would like P(F|Ft) to be a probability measure when it is considered as a function of F. This property, however, is not as evident as it seems because of the null sets involved in the definition of conditional expectation. We do not worry about technical details here and assume instead that we are given a regular conditional probability, i.e. a version of P(F|Ft)(ω) which, for any fixed ω, is a probability measure when viewed as a function of F. Such a regular version exists in all instances where it is used in these notes. In line with (1.2) we denote by
$$P^{X|\mathcal{F}_t}(B) := P(X \in B\,|\,\mathcal{F}_t) := E(1_B(X)\,|\,\mathcal{F}_t)$$

the conditional law of X given Ft. A useful rule states that

$$E(f(X, Y)\,|\,\mathcal{F}_t) = \int f(x, Y)\, P^{X|\mathcal{F}_t}(dx) \qquad (1.3)$$

for real-valued functions f and Ft-measurable random variables Y. If X is stochastically independent of Ft, we have $P^{X|\mathcal{F}_t} = P^X$, i.e. the conditional law of X coincides with the law of X. In this case, (1.3) turns into

$$E(f(X, Y)\,|\,\mathcal{F}_t) = \int f(x, Y)\, P^X(dx) \qquad (1.4)$$

for Ft-measurable random variables Y.

A stochastic process X = (X(t))t≥0 is a collection of random variables X(t), indexed by time t. In this section, the time set is assumed to be N = {0, 1, 2, ...}; afterwards we consider continuous time R+ = [0, ∞). As noted earlier, a stochastic process X = (X(t))t≥0 can be interpreted as a random function of time. Indeed, X(ω, t) is a function of t (or a sequence in the current discrete case) for fixed ω. Sometimes it is also convenient to interpret a process X as a real-valued function on the product space Ω × N or Ω × R+, respectively. In the discrete-time case we use the notation
∆X(t) := X(t) − X(t − 1).
Moreover we denote by X− = (X−(t))t≥0 the process
$$X_-(t) := \begin{cases} X(t-1) & \text{for } t \ge 1, \\ X(0) & \text{for } t = 0. \end{cases} \qquad (1.5)$$
We will only consider processes which are consistent with the information structure, i.e. X(t) is supposed to be observable at time t. Mathematically speaking, we assume X(t) to be Ft-measurable for any t. Such processes X are called adapted (to the filtration F). There is in fact a minimal filtration F such that X is F-adapted. Formally, this filtration is given by

$$\mathcal{F}_t = \sigma(X(s) : s \le t), \qquad (1.6)$$
i.e. Ft is the smallest σ-field such that all X(s), s ≤ t are Ft-measurable. Intuitively, this means that the only information on random events is coming from observing the process X. One calls F the filtration generated by X.

For some processes one actually needs a stronger notion of measurability than adaptedness, namely predictability. A stochastic process X is called predictable if X(t) is known already one period in advance, i.e. X(t) is Ft−1-measurable. The use of this notion will become clearer in Section 1.2.

Example 1.1 (Random walk and geometric random walk) We call an adapted process X a random walk (relative to F) if the increments ∆X(t), t ≥ 1 are identically distributed and independent of Ft−1. We obtain such a process if ∆X(t), t ≥ 1 are independent and identically distributed (i.i.d.) random variables and if the filtration F is generated by X. Similarly, we call a positive adapted process X a geometric random walk (relative to F) if the relative increments

$$\frac{\Delta X(t)}{X(t-1)} = \frac{X(t)}{X(t-1)} - 1 \qquad (1.7)$$

are identically distributed and independent of Ft−1 for t ≥ 1.

A process X is a geometric random walk if and only if log(X) is a random walk or, equivalently, X(t) = exp(Y(t)) for some random walk Y. Indeed, the random variables in (1.7) are identically distributed and independent of Ft−1 if and only if this holds for

$$\Delta(\log X)(t) = \log(X(t)) - \log(X(t-1)) = \log\Big(\frac{\Delta X(t)}{X(t-1)} + 1\Big), \qquad t \ge 1.$$

Random walks and geometric random walks represent processes of constant growth in an additive or multiplicative sense, respectively. Simple asset price models are often of geometric random walk type.
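The relation between random walks and geometric random walks in Example 1.1 can be checked pathwise. The following sketch (with arbitrary simulated increments, chosen only for illustration) verifies that X = exp(Y) has relative increments exp(∆Y) − 1 and that ∆(log X) = ∆Y:

```python
import random, math

# Simulate a random walk Y and the induced geometric random walk X = exp(Y).
random.seed(0)
T = 10
dY = [random.gauss(0.0, 0.3) for _ in range(T)]   # i.i.d. increments of Y
Y = [0.0]
for d in dY:
    Y.append(Y[-1] + d)
X = [math.exp(y) for y in Y]                       # positive process, X(0) = 1

for t in range(1, T + 1):
    rel = X[t] / X[t - 1] - 1                      # relative increment (1.7)
    assert abs(rel - (math.exp(dY[t - 1]) - 1)) < 1e-12
    assert abs(math.log(1 + rel) - dY[t - 1]) < 1e-12   # Delta(log X) = Delta Y
```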
A stopping time τ is a random variable whose values are times, i.e. are in N ∪ {∞} in the discrete case. Additionally one requires that τ is consistent with the information structure F. More precisely, one assumes that {τ = t} ∈ Ft (or equivalently {τ ≤ t} ∈ Ft) for any t. Intuitively, this means that the decision to say "stop!" right now can only be based on our current information. As an example consider the first time τ when the observed stock price hits the level 100. Even though this time is random and not known in advance, we obviously know τ in the instant it occurs. The situation is different if we define τ to be the instant one period before the stock hits 100. Since we cannot look into the future, we only know τ one period after it has happened. Consequently, this random variable is not a stopping time. Stopping times occur naturally in finance, e.g. in the context of American options, but they also play an important technical role in stochastic calculus.

As indicated above, the time when some adapted process first hits a given set is a stopping time:

Lemma 1.2 Let X be some adapted process and B a Borel set. Then τ := inf{t ≥ 0 : X(t) ∈ B} is a stopping time.
Proof. By adaptedness, we have {X(s) ∈ B} ∈ Fs ⊂ Ft, s ≤ t and hence
$$\{\tau \le t\} = \bigcup_{s=0}^{t} \{X(s) \in B\} \in \mathcal{F}_t.$$
Occasionally, it turns out to be important to "freeze" a process at a stopping time. For any adapted process X and any stopping time τ, the process stopped at τ is defined as
$$X^\tau(t) := X(\tau \wedge t),$$

where we use the notation a ∧ b := min(a, b) as usual. The stopped process $X^\tau$ remains constant on the level X(τ) after time τ. It is easy to see that it is adapted as well.

The concept of martingales is central to stochastic calculus and finance. A martingale (resp. submartingale, supermartingale) is an adapted process X that is integrable (in the sense that E(|X(t)|) < ∞ for any t) and satisfies
$$E(X(t)\,|\,\mathcal{F}_s) = X(s) \quad (\text{resp. } \ge X(s),\ \le X(s)) \qquad (1.8)$$

for s ≤ t. If X is a martingale, then the best prediction for future values is the present level. If e.g. the price process of an asset is a martingale, then it is neither going up nor down on average. In that sense, it corresponds to a fair game. By contrast, submartingales (resp. supermartingales) may increase (resp. decrease) on average. They correspond to favourable (resp. unfavourable) games.

The concept of a martingale is "global" in the sense that (1.8) must be satisfied for any s ≤ t. If we restrict attention to the case s = t − 1, we obtain the slightly more general "local" counterpart. A local martingale (resp. submartingale, supermartingale) is an adapted process X which satisfies E(|X(0)|) < ∞, E(|X(t)| | Ft−1) < ∞ and
$$E(X(t)\,|\,\mathcal{F}_{t-1}) = X(t-1) \quad (\text{resp. } \ge X(t-1),\ \le X(t-1)) \qquad (1.9)$$

for any t = 1, 2, . . . In discrete time the difference between martingales and local martingales is minor:
Lemma 1.3 Any integrable local martingale (in the sense that E(|X(t)|) < ∞ for any t) is a martingale. An analogous statement holds for sub- and supermartingales.
Proof. This follows by induction from the law of iterated expectations.
Corresponding statements hold for sub-/supermartingales. Integrability in Lemma 1.3 holds e.g. if X is nonnegative. The above classes of processes are stable under stopping in the sense of the following lemma, which has a natural economic interpretation: you cannot turn a fair game into e.g. a strictly favourable one by stopping to play at some reasonable time.
Lemma 1.4 Let τ denote a stopping time. If X is a martingale (resp. sub-/supermartingale), so is $X^\tau$. A corresponding statement holds for local martingales and local sub-/supermartingales.
Proof. We start by verifying the integrability conditions. For martingales (resp. sub-/supermartingales), E(|X(t)|) < ∞ implies $E(|X^\tau(t)|) \le E\big(\sum_{s=0}^{t} |X(s)|\big) < \infty$. For local martingales (resp. local sub-/supermartingales), E(|X(t)| | Ft−1) < ∞ yields

$$E(|X^\tau(t)|\,|\,\mathcal{F}_{t-1}) \le \sum_{s=0}^{t} E(|X(s)|\,|\,\mathcal{F}_{t-1}) = \sum_{s=0}^{t-1} |X(s)| + E(|X(t)|\,|\,\mathcal{F}_{t-1}) < \infty.$$
In order to verify (1.9), observe that $\{\tau \ge t\} = \{\tau \le t-1\}^C \in \mathcal{F}_{t-1}$ implies
$$\begin{aligned}
E(X^\tau(t)\, 1_{\{\tau \ge t\}}\,|\,\mathcal{F}_{t-1}) &= E(X(t)\, 1_{\{\tau \ge t\}}\,|\,\mathcal{F}_{t-1}) \\
&= E(X(t)\,|\,\mathcal{F}_{t-1})\, 1_{\{\tau \ge t\}} \\
&= X(t-1)\, 1_{\{\tau \ge t\}} \\
&= X^\tau(t-1)\, 1_{\{\tau \ge t\}}
\end{aligned}$$
(resp. ≥, ≤ in the sub-/supermartingale case). For s < t we have {τ = s} ∈ Fs ⊂ Ft−1 and hence
$$\begin{aligned}
E(X^\tau(t)\, 1_{\{\tau = s\}}\,|\,\mathcal{F}_{t-1}) &= E(X(s)\, 1_{\{\tau = s\}}\,|\,\mathcal{F}_{t-1}) \\
&= X(s)\, 1_{\{\tau = s\}} \\
&= X^\tau(t-1)\, 1_{\{\tau = s\}}.
\end{aligned}$$
Together we obtain
$$\begin{aligned}
E(X^\tau(t)\,|\,\mathcal{F}_{t-1}) &= \sum_{s=0}^{t-1} E(X^\tau(t)\, 1_{\{\tau = s\}}\,|\,\mathcal{F}_{t-1}) + E(X(t)\, 1_{\{\tau \ge t\}}\,|\,\mathcal{F}_{t-1}) \\
&= \sum_{s=0}^{t-1} X^\tau(t-1)\, 1_{\{\tau = s\}} + X^\tau(t-1)\, 1_{\{\tau \ge t\}} = X^\tau(t-1)
\end{aligned}$$
(resp. ≥, ≤).
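Lemma 1.4 can be verified exactly on a finite probability space. In the sketch below (an invented example), X is a random walk with centred ±1 increments, τ is the first time X hits level 2, and the martingale property of the stopped process is checked by averaging over the atoms of F_{t−1}:

```python
from itertools import product
from collections import defaultdict

# X = random walk with P(dX = +1) = P(dX = -1) = 1/2 (a martingale),
# tau = first time X hits level 2, X^tau = process stopped at tau.
T = 5
omega = list(product([-1, 1], repeat=T))
P = {w: 2.0 ** -T for w in omega}

def path(w):
    """X(0), ..., X(T) along the outcome w."""
    x = [0]
    for d in w:
        x.append(x[-1] + d)
    return x

def tau(w):
    """First time X hits 2; T + 1 encodes 'not within the horizon'."""
    return next((t for t, v in enumerate(path(w)) if v == 2), T + 1)

def stopped(w, t):
    """X^tau(t) = X(min(tau, t))."""
    return path(w)[min(tau(w), t)]

# E(X^tau(t) | F_{t-1}) = X^tau(t-1): average over the atoms of F_{t-1},
# which are determined by the first t-1 increments.
for t in range(1, T + 1):
    num, den = defaultdict(float), defaultdict(float)
    for w, p in P.items():
        num[w[:t - 1]] += p * stopped(w, t)
        den[w[:t - 1]] += p
    for w in omega:
        assert abs(num[w[:t - 1]] / den[w[:t - 1]] - stopped(w, t - 1)) < 1e-12
```

In particular E(X^τ(T)) = X(0) = 0, in line with the interpretation that stopping cannot turn a fair game into a favourable one.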
For later use, we note that a supermartingale with constant expectation is actually a martingale.

Lemma 1.5 If X is a supermartingale and T ≥ 0 with E(X(T)) = E(X(0)), then (1.8) holds with equality for any s ≤ t ≤ T.
Proof. The supermartingale property means that E((X(t) − X(s))1A) ≤ 0 for any s ≤ t and any A ∈ Fs. Since
$$\begin{aligned}
0 &= E(X(T)) - E(X(0)) \\
&= E(X(T) - X(t)) + E((X(t) - X(s))\, 1_A) + E((X(t) - X(s))\, 1_{A^C}) + E(X(s) - X(0)) \le 0
\end{aligned}$$
for any s ≤ t ≤ T and any event A ∈ Fs, the four nonpositive summands must actually be 0. This yields E((X(t) − X(s))1A) = 0 and hence the assertion.
The following technical result is used in Chapter 3.
Lemma 1.6 Let X be a supermartingale, Y a martingale, t ≤ T, and A ∈ Ft with X(t) = Y(t) on A and X(T) ≥ Y(T). Then X(s) = Y(s) on A for t ≤ s ≤ T. The statement remains true if we only require $X^T - X^t$, $Y^T - Y^t$ instead of X, Y to be a supermartingale resp. martingale.
Proof. From
X(s) − Y (s) ≥ E(X(T ) − Y (T )|Fs) ≥ 0 and
E((X(s) − Y(s))1A) ≤ E((X(t) − Y(t))1A) = 0

it follows that (X(s) − Y(s))1A = 0.
One easily verifies that an integrable random walk X is a martingale if (and only if) the increments ∆X(t) have expectation 0. An analogous result holds for integrable geomet- ric random walks whose relative increments ∆X(t)/X(t − 1) have vanishing mean. For the martingale property to hold, one actually does not need the increments resp. relative increments of X to be identically distributed. If ξ denotes an integrable random variable, then it naturally induces a martingale X, namely
X(t) = E(ξ|Ft).

X is called the martingale generated by ξ. If the time horizon is finite, i.e. we consider the time set {0, 1, . . . , T − 1, T} rather than N, then any martingale is generated by some random variable, namely by X(T). This ceases to be true for an infinite time horizon. E.g. random walks are not generated by a single random variable unless they are constant.
Example 1.7 (Density process) A probability measure Q on (Ω, F) is called equivalent to P (written Q ∼ P) if the events of probability 0 are the same under P and Q. By the Radon–Nikodym theorem, Q has a P-density and vice versa, i.e. there are some unique random variables $\frac{dQ}{dP}$, $\frac{dP}{dQ}$ such that

$$Q(F) = E_P\Big(1_F \frac{dQ}{dP}\Big), \qquad P(F) = E_Q\Big(1_F \frac{dP}{dQ}\Big)$$

for any set F ∈ F, where $E_P$, $E_Q$ denote expectation under P and Q, respectively. P, Q are in fact equivalent if and only if such mutual densities exist, in which case we have $\frac{dP}{dQ} = 1\big/\frac{dQ}{dP}$.

The martingale Z generated by $\frac{dQ}{dP}$ is called the density process of Q, i.e. we have

$$Z(t) = E_P\Big(\frac{dQ}{dP}\,\Big|\,\mathcal{F}_t\Big).$$
One easily verifies that Z(t) coincides with the density of the restricted measures Q|Ft relative to P |Ft, i.e. Z(t) is Ft-measurable and
Q(F ) = EP (1F Z(t)) holds for any event F ∈ Ft. Note further that Z and the density process Y of P relative to Q are reciprocal to each other because
$$Z(t) = \frac{dQ|_{\mathcal{F}_t}}{dP|_{\mathcal{F}_t}} = 1\Big/\frac{dP|_{\mathcal{F}_t}}{dQ|_{\mathcal{F}_t}} = 1/Y(t).$$

The density process Z can be used to compute conditional expectations relative to Q. Indeed, the generalised Bayes' rule
$$E_Q(\xi\,|\,\mathcal{F}_t) = \frac{E_P\big(\xi \frac{dQ}{dP}\,\big|\,\mathcal{F}_t\big)}{Z(t)}$$

holds for sufficiently integrable random variables ξ because

$$\begin{aligned}
E_Q(\xi\zeta) &= E_P\Big(\xi\zeta \frac{dQ}{dP}\Big) \\
&= E_P\Big(E_P\Big(\xi\zeta \frac{dQ}{dP}\,\Big|\,\mathcal{F}_t\Big)\Big) \\
&= E_P\bigg(\frac{E_P\big(\xi \frac{dQ}{dP}\,\big|\,\mathcal{F}_t\big)}{Z(t)}\,\zeta\, Z(t)\bigg) \\
&= E_Q\bigg(\frac{E_P\big(\xi \frac{dQ}{dP}\,\big|\,\mathcal{F}_t\big)}{Z(t)}\,\zeta\bigg)
\end{aligned}$$

for any bounded Ft-measurable ζ. Similarly, one shows
$$E_Q(\xi\,|\,\mathcal{F}_s) = \frac{E_P(\xi Z(t)\,|\,\mathcal{F}_s)}{Z(s)} \qquad (1.10)$$

for s ≤ t and Ft-measurable random variables ξ.

Martingales are expected to stay on the current level on average. More general processes may show an increasing, decreasing or possibly variable trend. This fact is expressed formally by a variant of Doob's decomposition. The idea is to decompose the increment ∆X(t) of an arbitrary process into a predictable trend component $\Delta A^X(t)$ and a random deviation $\Delta M^X(t)$ from this short-time prediction.

Lemma 1.8 (canonical decomposition) Any integrable adapted process X (i.e. with E(|X(t)|) < ∞ for any t) can be uniquely decomposed as $X = X(0) + M^X + A^X$ with some martingale $M^X$ and some predictable process $A^X$ satisfying $M^X(0) = A^X(0) = 0$. We call $A^X$ the compensator of X.
Proof. Define $A^X(t) = \sum_{s=1}^{t} \Delta A^X(s)$ by $\Delta A^X(s) := E(\Delta X(s)\,|\,\mathcal{F}_{s-1})$ and $M^X := X - X(0) - A^X$. Predictability of $A^X$ is obvious. The integrability of X implies that of $A^X$ and thus of $M^X$. The latter is a martingale because

$$\begin{aligned}
E(M^X(t)\,|\,\mathcal{F}_{t-1}) &= M^X(t-1) + E(\Delta X(t) - \Delta A^X(t)\,|\,\mathcal{F}_{t-1}) \\
&= M^X(t-1) + E(\Delta X(t)\,|\,\mathcal{F}_{t-1}) - E\big(E(\Delta X(t)\,|\,\mathcal{F}_{t-1})\,\big|\,\mathcal{F}_{t-1}\big) \\
&= M^X(t-1).
\end{aligned}$$
Conversely, for any decomposition as in Lemma 1.8 we have
$$E(\Delta X(t)\,|\,\mathcal{F}_{t-1}) = E(\Delta M^X(t)\,|\,\mathcal{F}_{t-1}) + E(\Delta A^X(t)\,|\,\mathcal{F}_{t-1}) = \Delta A^X(t),$$

which means that it coincides with the decomposition in the first part of the proof.
Note that uniqueness of the decomposition still holds if we only require $M^X$ to be a local martingale. In this relaxed sense, it suffices to assume E(|X(t)| | Ft−1) < ∞ for any t in order to define the compensator $A^X$. If X is a submartingale (resp. supermartingale), then $A^X$ is increasing (resp. decreasing). This is the case commonly referred to as Doob's decomposition. E.g. the compensator of an integrable random walk X equals
$$A^X(t) = \sum_{s=1}^{t} E(\Delta X(s)\,|\,\mathcal{F}_{s-1}) = t\, E(\Delta X(1)).$$
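The Doob decomposition of a random walk with drift can likewise be checked by exact computation on a finite space. An invented example: i.i.d. Bernoulli(1/2) increments in {0, 1}, so the compensator is $A^X(t) = t/2$ and $M^X(t) = X(t) - t/2$ should be a martingale:

```python
from itertools import product
from collections import defaultdict

# Random walk with i.i.d. increments in {0, 1}, P(dX = 1) = 1/2, so that
# E(Delta X(1)) = 1/2 and the compensator is A(t) = t/2 (Doob decomposition).
T = 4
omega = list(product([0, 1], repeat=T))
P = {w: 2.0 ** -T for w in omega}
X = lambda w, t: sum(w[:t])                   # the random walk
M = lambda w, t: X(w, t) - 0.5 * t            # candidate martingale part X - A

# Verify E(M(t) | F_{t-1}) = M(t-1) on every atom of F_{t-1}:
for t in range(1, T + 1):
    num, den = defaultdict(float), defaultdict(float)
    for w, p in P.items():
        num[w[:t - 1]] += p * M(w, t)
        den[w[:t - 1]] += p
    for w in omega:
        assert abs(num[w[:t - 1]] / den[w[:t - 1]] - M(w, t - 1)) < 1e-12
```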
1.2 Stochastic integration
Gains from trade in dynamic portfolios can be expressed in terms of stochastic integrals, which in discrete time are nothing other than sums.
Definition 1.9 Let X be an adapted and ϕ a predictable (or at least adapted) process. The stochastic integral of ϕ relative to X is the adapted process ϕ • X defined as
$$\varphi \bullet X(t) := \sum_{s=1}^{t} \varphi(s)\, \Delta X(s).$$
If both ϕ = (ϕ1, . . . , ϕd) and X = (X1,...,Xd) are vector-valued processes, we define ϕ • X to be the real-valued process given by
$$\varphi \bullet X(t) := \sum_{s=1}^{t} \sum_{i=1}^{d} \varphi_i(s)\, \Delta X_i(s). \qquad (1.11)$$
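In discrete time the stochastic integral is literally a cumulative sum, so a direct implementation is a short loop. The price path and strategy below are hypothetical numbers, chosen only to make the gains easy to check by hand:

```python
def integral(phi, X):
    """phi . X as a path: integral(phi, X)[t] = sum_{s<=t} phi[s] * (X[s] - X[s-1]).

    X is a list X[0..T]; phi is a list with phi[s] the holding during period s
    (phi[0] is unused and may be None)."""
    out = [0.0]
    for s in range(1, len(X)):
        out.append(out[-1] + phi[s] * (X[s] - X[s - 1]))
    return out

X = [100.0, 102.0, 101.0, 105.0]     # price path X(0), ..., X(3)
phi = [None, 1.0, 2.0, 2.0]          # shares held during periods 1, 2, 3
G = integral(phi, X)                 # cumulative gains from trade
# 1*(102-100) + 2*(101-102) + 2*(105-101) = 2 - 2 + 8
assert G == [0.0, 2.0, 0.0, 8.0]
```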
In order to motivate this definition, let us interpret X(t) as the price of a stock at time t. We invest in this stock using the trading strategy ϕ, i.e. ϕ(t) denotes the number of shares we own at time t. Due to the price move from X(t − 1) to X(t) our wealth changes by ϕ(t)(X(t) − X(t − 1)) = ϕ(t)∆X(t) in the period between t − 1 and t. Consequently, the integral ϕ • X(t) stands for the cumulative gains from trade up to time t. If we invest in a portfolio of several stocks, both the trading strategy ϕ and the price process X are vector valued. ϕi(t) now stands for the number of shares of stock i and Xi(t) for its price. In order to compute the total gains of the portfolio, we must sum up the gains ϕi(t)∆Xi(t) in each single stock, which leads to (1.11).

For the above reasoning to make sense, one must be careful about the order in which things happen at time t. If ϕ(t)(X(t) − X(t − 1)) is meant to stand for the gains at time t, we obviously have to buy the portfolio ϕ(t) before prices change from X(t − 1) to X(t). Put differently, we must choose ϕ(t) already at the end of period t − 1, right after the stock price has attained the value X(t − 1). This choice can only be based on information up to time t − 1 and in particular not on X(t), which is as yet unknown. This motivates why one typically requires trading strategies to be predictable rather than adapted. The purely mathematical definition of ϕ • X, however, makes sense regardless of any measurability assumption.

The covariation process [X, Y] of adapted processes X, Y is defined as
$$[X, Y](t) := \sum_{s=1}^{t} \Delta X(s)\, \Delta Y(s). \qquad (1.12)$$

Its compensator

$$\langle X, Y \rangle(t) := \sum_{s=1}^{t} E(\Delta X(s)\, \Delta Y(s)\,|\,\mathcal{F}_{s-1})$$

is called the predictable covariation process if it exists. In the special case X = Y one refers to the quadratic variation resp. predictable quadratic variation of X. If X, Y are martingales, their predictable covariation can be viewed as a dynamic analogue of the covariance of two random variables.

We are now ready to state a few properties of stochastic integration:

Lemma 1.10 For adapted processes X, Y and predictable processes ϕ, ψ we have:
1. ϕ • X is linear in ϕ and X.
2. [X,Y ] and hX,Y i are symmetric and linear in X and Y .
3. ψ • (ϕ • X) = (ψϕ) • X
4. [ϕ • X,Y ] = ϕ • [X,Y ]
5. hϕ • X,Y i = ϕ • hX,Y i whenever the predictable covariations are defined.
6. (integration by parts)
XY = X(0)Y (0) + X− • Y + Y • X (1.13)
= X(0)Y (0) + X− • Y + Y− • X + [X,Y ]
7. If X is a local martingale, so is ϕ • X.
8. If ϕ ≥ 0 and X is a local sub-/supermartingale, ϕ • X is a local sub-/supermartingale as well. 18 CHAPTER 1. RECAP OF STOCHASTIC PROCESSES
9. $A^{\varphi \bullet X} = \varphi \bullet A^X$ if the compensator $A^X$ exists in the relaxed sense following Lemma 1.8.

10. If X, Y are martingales with E(|X(t)|) < ∞ and E(|Y(t)|) < ∞ for any t, the process XY − ⟨X, Y⟩ is a martingale, i.e. ⟨X, Y⟩ is the compensator of XY.

Proof.

1. This is obvious from the definition.

2. This is obvious from the definition as well.

3. This follows from
∆(ψ • (ϕ • X))(t) = ψ(t)∆(ϕ • X)(t) = ψ(t)ϕ(t)∆X(t) = ∆((ψϕ) • X)(t).
4. This follows from
∆[ϕ • X,Y ](t) = ϕ(t)∆X(t)∆Y (t) = ∆(ϕ • [X,Y ])(t).
5. Predictability of ϕ yields
∆hϕ • X,Y i(t) = E(∆(ϕ • X)(t)∆Y (t) |Ft−1)
= E(ϕ(t)∆X(t)∆Y (t) |Ft−1)
= ϕ(t)E(∆X(t)∆Y (t) |Ft−1) = ∆(ϕ • hX,Y i)(t).
6. The first equation is
$$\begin{aligned}
X(t)Y(t) &= X(0)Y(0) + \sum_{s=1}^{t} \big(X(s)Y(s) - X(s-1)Y(s-1)\big) \\
&= X(0)Y(0) + \sum_{s=1}^{t} X(s-1)(Y(s) - Y(s-1)) + \sum_{s=1}^{t} (X(s) - X(s-1))Y(s) \\
&= X(0)Y(0) + \sum_{s=1}^{t} \big(X(s-1)\,\Delta Y(s) + Y(s)\,\Delta X(s)\big)
\end{aligned}$$
= X(0)Y (0) + X− • Y (t) + Y • X(t). The second follows from
Y • X(t) = Y− • X(t) + (∆Y ) • X(t) = Y− • X(t) + [X,Y ](t).
7. Predictability of ϕ and (1.9) yield
E(ϕ • X(t)|Ft−1) = E(ϕ • X(t − 1) + ϕ(t)(X(t) − X(t − 1))|Ft−1)
= ϕ • X(t − 1) + ϕ(t)(E(X(t)|Ft−1) − X(t − 1)) = ϕ • X(t − 1).
8. This follows along the same lines as 7.
9. This follows from 7. because ϕ • X = ϕ • M X + ϕ • AX is the canonical decompo- sition of ϕ • X. 10. This follows from statements 6 and 7.
If they make sense, the above rules also hold for vector-valued processes, e.g.
$$\psi \bullet (\varphi \bullet X) = (\psi\varphi) \bullet X$$

if both ϕ, X are Rd-valued.

Itô's formula is probably the most important rule in continuous-time stochastic calculus. This motivates why we state its obvious discrete-time counterpart here.

Lemma 1.11 (Itô's formula) If X is an Rd-valued adapted process and f : Rd → R a differentiable function, then

$$\begin{aligned}
f(X(t)) &= f(X(0)) + \sum_{s=1}^{t} \big(f(X(s)) - f(X(s-1))\big) \\
&= f(X(0)) + Df(X_-) \bullet X(t) + \sum_{s=1}^{t} \Big(f(X(s)) - f(X(s-1)) - Df(X(s-1))^\top \Delta X(s)\Big), \qquad (1.14)
\end{aligned}$$

where Df(x) denotes the derivative or gradient of f in x.

Proof. The first statement is obvious. The second follows from the definition of the stochastic integral.
If the increments ∆X(s) are small and f is sufficiently smooth, we may use the second-order Taylor expansion

$$f(X(s)) = f(X(s-1) + \Delta X(s)) \approx f(X(s-1)) + f'(X(s-1))\,\Delta X(s) + \tfrac{1}{2} f''(X(s-1))\,(\Delta X(s))^2$$

in the univariate case, which leads to

$$\begin{aligned}
f(X(t)) &\approx f(X(0)) + f'(X_-) \bullet X(t) + \sum_{s=1}^{t} \tfrac{1}{2} f''(X(s-1))\,(\Delta X(s))^2 \\
&= f(X(0)) + f'(X_-) \bullet X(t) + \tfrac{1}{2} f''(X_-) \bullet [X, X](t).
\end{aligned}$$

If X is vector valued, we obtain accordingly

$$f(X(t)) \approx f(X(0)) + Df(X_-) \bullet X(t) + \tfrac{1}{2} \sum_{i,j=1}^{d} D_{ij} f(X_-) \bullet [X_i, X_j](t). \qquad (1.15)$$

Processes of multiplicative structure are called stochastic exponentials.
Definition 1.12 Let X be an adapted process. The unique adapted process Z satisfying
$$Z = 1 + Z_- \bullet X$$

is called the stochastic exponential of X, and it is written E(X). The stochastic exponential can easily be motivated from a financial point of view. Suppose that 1€ earns the possibly random interest ∆X(t) in period t, i.e. 1€ at time t − 1 turns into (1 + ∆X(t))€ at time t. Then 1€ at time 0 runs up to E(X)(t)€ at time t. It is easy to compute E(X) explicitly:

Lemma 1.13 We have

$$\mathscr{E}(X)(t) = \prod_{s=1}^{t} (1 + \Delta X(s)),$$

where the product is set to 1 for t = 0.

Proof. For $Z(t) = \prod_{s=1}^{t}(1 + \Delta X(s))$ we have ∆Z(t) = Z(t) − Z(t − 1) = Z(t − 1)∆X(t) and hence

$$Z(t) = Z(0) + \sum_{s=1}^{t} \Delta Z(s) = 1 + \sum_{s=1}^{t} Z(s-1)\,\Delta X(s) = 1 + Z_- \bullet X(t).$$
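Lemma 1.13 makes the stochastic exponential a running product, which is immediate to implement and to test against the defining equation Z = 1 + Z_− • X. The increments below are arbitrary illustrative numbers:

```python
def stoch_exp(X):
    """E(X)(t) = prod_{s<=t} (1 + Delta X(s)), for X given as a list X[0..T]."""
    Z = [1.0]
    for s in range(1, len(X)):
        Z.append(Z[-1] * (1.0 + X[s] - X[s - 1]))
    return Z

X = [0.0, 0.05, -0.02, 0.10, 0.03]        # increments stay > -1, so E(X) > 0
Z = stoch_exp(X)

# Defining equation Z = 1 + Z_- . X, checked pathwise:
I = [0.0]                                  # I = Z_- . X
for s in range(1, len(X)):
    I.append(I[-1] + Z[s - 1] * (X[s] - X[s - 1]))
assert all(abs(Z[s] - (1.0 + I[s])) < 1e-12 for s in range(len(X)))

# Conversely, the integral (1/Z_-) . Z recovers X - X(0):
L = [0.0]
for s in range(1, len(X)):
    L.append(L[-1] + (Z[s] - Z[s - 1]) / Z[s - 1])
assert all(abs(L[s] - (X[s] - X[0])) < 1e-12 for s in range(len(X)))
```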
The previous lemma implies that the stochastic exponential of a random walk $\tilde X$ with increments $\Delta \tilde X(t) > -1$ is a geometric random walk. More specifically, one can write any geometric random walk Z alternatively in exponential or stochastic exponential form, namely $Z = e^X = Z(0)\,\mathscr{E}(\tilde X)$ with random walks $X, \tilde X$, respectively. $X$ and $\tilde X$ are related to each other via $\Delta \tilde X(t) = e^{\Delta X(t)} - 1$ resp. $\Delta X(t) = \log(1 + \Delta \tilde X(t))$. If the increments ∆X(s) are small enough, we can use the approximation

$$\log(1 + \Delta X(s)) \approx \Delta X(s) - \tfrac{1}{2}(\Delta X(s))^2$$

and obtain

$$\begin{aligned}
\mathscr{E}(X)(t) &= \prod_{s=1}^{t} (1 + \Delta X(s)) = \exp\bigg(\sum_{s=1}^{t} \log(1 + \Delta X(s))\bigg) \\
&\approx \exp\bigg(\sum_{s=1}^{t} \Big(\Delta X(s) - \tfrac{1}{2}(\Delta X(s))^2\Big)\bigg) \\
&= \exp\Big(X(t) - X(0) - \tfrac{1}{2}[X, X](t)\Big).
\end{aligned}$$
The product of stochastic exponentials is again a stochastic exponential. Observe the similarity of the following result to the rule exey = ex+y for exponential functions.
Lemma 1.14 (Yor’s formula)
\[
\mathscr{E}(X)\,\mathscr{E}(Y) = \mathscr{E}(X + Y + [X,Y])
\]
holds for any two adapted processes X, Y.
Proof. Let Z := \mathscr{E}(X)\mathscr{E}(Y). Integration by parts and the other statements of Lemma 1.10 yield
\[
Z = Z(0) + \mathscr{E}(X)_- \bullet \mathscr{E}(Y) + \mathscr{E}(Y)_- \bullet \mathscr{E}(X) + [\mathscr{E}(X), \mathscr{E}(Y)]
= 1 + (\mathscr{E}(X)_-\mathscr{E}(Y)_-) \bullet Y + (\mathscr{E}(Y)_-\mathscr{E}(X)_-) \bullet X + (\mathscr{E}(X)_-\mathscr{E}(Y)_-) \bullet [X,Y]
= 1 + Z_- \bullet (X + Y + [X,Y]),
\]
which implies that Z = \mathscr{E}(X + Y + [X,Y]).
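By Lemma 1.13, Yor's formula reduces to the one-period identity (1 + ∆X)(1 + ∆Y) = 1 + ∆X + ∆Y + ∆X∆Y. The following sketch (illustrative increments only) checks this pathwise:

```python
import random

# Sketch: Yor's formula E(X)E(Y) = E(X+Y+[X,Y]) via the product form of
# Lemma 1.13, using made-up increments.
random.seed(2)
T = 30
dX = [random.uniform(-0.1, 0.1) for _ in range(T)]
dY = [random.uniform(-0.1, 0.1) for _ in range(T)]

def stoch_exp(increments):
    path, z = [1.0], 1.0
    for x in increments:
        z *= 1 + x
        path.append(z)
    return path

lhs = [a * b for a, b in zip(stoch_exp(dX), stoch_exp(dY))]
dZ = [x + y + x * y for x, y in zip(dX, dY)]    # increments of X + Y + [X,Y]
rhs = stoch_exp(dZ)

assert all(abs(a - b) < 1e-12 for a, b in zip(lhs, rhs))
```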
If an adapted process Z does not attain the value 0, it can be written as
\[
Z = Z(0)\,\mathscr{E}(X)
\]
with some unique process X satisfying X(0) = 0. This process X is naturally called the stochastic logarithm \mathscr{L}(Z) of Z. We have
\[
\mathscr{L}(Z) = \frac{1}{Z_-} \bullet Z.
\]
Indeed, X := \frac{1}{Z_-} \bullet Z satisfies
\[
\frac{Z_-}{Z(0)} \bullet X = \frac{Z_-}{Z(0)}\,\frac{1}{Z_-} \bullet Z = \frac{1}{Z(0)} \bullet Z = \frac{Z}{Z(0)} - 1
\]
and hence Z/Z(0) = \mathscr{E}(X) as claimed.

Changes of the underlying probability measure play an important role in Mathematical Finance. Since the notion of a martingale involves expectation, it is not invariant under such measure changes.
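As a quick numerical check of the relation Z = Z(0)𝓔(𝓛(Z)), the sketch below (with a made-up positive path Z) computes the increments ∆𝓛(Z)(t) = ∆Z(t)/Z(t−1) and rebuilds Z from them:

```python
import random

# Sketch with an invented positive path Z: the stochastic logarithm has
# increments dL(t) = dZ(t) / Z(t-1), and Z(0) * E(L(Z)) recovers Z.
random.seed(3)
T = 25
Z = [2.0]
for _ in range(T):
    Z.append(Z[-1] * random.uniform(0.8, 1.2))

dL = [(Z[t] - Z[t - 1]) / Z[t - 1] for t in range(1, T + 1)]  # (1/Z_-) . Z

rebuilt = Z[0]
for t, x in enumerate(dL, start=1):
    rebuilt *= 1 + x                    # Z(0) * prod(1 + dL)
    assert abs(rebuilt - Z[t]) < 1e-9
```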
Lemma 1.15 Let Q ∼ P be a probability measure with density process Z. An adapted process X is a Q-martingale (resp. Q-local martingale) if and only if XZ is a P -martingale (resp. P -local martingale).
Proof. X is a Q-local martingale if and only if
\[
E_Q(X(t) \mid \mathscr{F}_{t-1}) = X(t-1) \tag{1.16}
\]
for any t. By Bayes' rule (1.10) the left-hand side equals E(X(t)Z(t)|F_{t-1})/Z(t-1). Hence (1.16) is equivalent to E(X(t)Z(t)|F_{t-1}) = X(t-1)Z(t-1), which is the local martingale property of XZ relative to P. The integrability property for martingales (cf. Lemma 1.3) is shown similarly.
A martingale X may possibly show a trend under the new probability measure Q. This trend can be expressed in terms of predictable covariation.
Lemma 1.16 Let Q ∼ P be a probability measure with density process Z. Moreover, suppose that X is a P-martingale. If X is Q-integrable, its Q-compensator equals
\[
\langle \mathscr{L}(Z), X\rangle.
\]
Proof. Since A := \langle\mathscr{L}(Z), X\rangle = \frac{1}{Z_-} \bullet \langle Z,X\rangle is a predictable process and by the proof of Lemma 1.8, it suffices to show that X − X(0) − A is a Q-local martingale. By Lemma 1.15 this amounts to proving that Z(X − X(0) − A) is a P-local martingale. Integration by parts yields
\[
Z(X - X(0) - A) = Z_- \bullet X + (X - X(0))_- \bullet Z + [Z, X - X(0)] - Z_- \bullet A - A \bullet Z. \tag{1.17}
\]
The integrals relative to X and Z are local martingales. Moreover,
\[
Z_- \bullet A = Z_-\,\frac{1}{Z_-} \bullet \langle Z,X\rangle = \langle Z,X\rangle
\]
is the compensator of [Z,X] = [Z, X − X(0)], which implies that the difference [Z, X − X(0)] − Z_- \bullet A is a local martingale. Altogether, the right-hand side of (1.17) is indeed a local martingale.
1.3 Conditional jump distribution
For later use we study the conditional law of the jumps of a stochastic process. We are particularly interested in the Markovian case where this law depends only on the current state of the process.
Definition 1.17 Let X be an adapted process with values in E ⊂ Rd. We call the mapping
\[
K^X(t,B) := P(\Delta X(t) \in B \mid \mathscr{F}_{t-1}) = E\big(1_B(\Delta X(t)) \mid \mathscr{F}_{t-1}\big), \quad t = 1,\dots,T,\ B \subset E,
\]
the conditional jump distribution of X. As usual, we skip the argument ω in the notation. If the conditional jump distribution depends on (ω, t) only through X(t−1)(ω), i.e. if it is of the form
\[
K^X(t,B) = \kappa(X(t-1), B)
\]
with some deterministic function κ, we call the process Markov. In this case we define the generator G of the Markov process, which is an operator mapping functions f: E → ℝ to functions Gf: E → ℝ. It is defined by the equation
\[
Gf(x) = \int (f(x+y) - f(x))\,\kappa(x, dy). \tag{1.18}
\]
Random walks have particularly simple conditional jump distributions.

Lemma 1.18 (Random walk) An adapted process X is a random walk if and only if its conditional jump distribution is of the form K^X(t,B) = Q(B) for some probability measure Q which does not depend on (ω, t). In this case Q is the law of ∆X(1). In particular, random walks are Markov processes.

Proof. If X is a random walk, we have
\[
P^{\Delta X(t)|\mathscr{F}_{t-1}} = P^{\Delta X(t)} = P^{\Delta X(1)}
\]
since ∆X(t) is independent of F_{t-1} and has the same law for all t. Conversely,
\[
P(\{\Delta X(t) \in B\} \cap A) = E(1_B(\Delta X(t))\,1_A) = \int E(1_B(\Delta X(t))\mid\mathscr{F}_{t-1})\,1_A\, dP = \int Q(B)\,1_A\, dP = Q(B)\,P(A)
\]
for A ∈ F_{t-1}. For A = Ω we obtain P(∆X(t) ∈ B) = Q(B) = P(∆X(1) ∈ B). Hence ∆X(t) is independent of F_{t-1} and has the same law for all t.
The conditional jump distribution of a geometric random walk is also easily obtained from its definition.

Example 1.19 (Geometric random walk) The jump characteristic of a geometric random walk X is given by
\[
K^X(t,B) = \varrho(\{x \in \mathbb{R} : X(t-1)(x-1) \in B\}),
\]
where \varrho denotes the distribution of X(1)/X(0). In particular, it is a Markov process as well. Indeed, we have
\[
E\big(1_B(\Delta X(t))\mid\mathscr{F}_{t-1}\big) = E\bigg(1_B\Big(X(t-1)\Big(\frac{X(t)}{X(t-1)} - 1\Big)\Big)\,\bigg|\,\mathscr{F}_{t-1}\bigg) = \int 1_B(X(t-1)(x-1))\,\varrho(dx)
\]
by (1.4) and the fact that X(t)/X(t−1) has law \varrho and is independent of F_{t-1}.

For the following we define the identity process I via I(t) := t. The characteristics can be used to compute the compensator of an adapted process.

Lemma 1.20 (Compensator) If X is an integrable adapted process, then its compensator A^X and its conditional jump distribution K^X are related to each other via
\[
A^X = a^X \bullet I \quad\text{with}\quad a^X(t) := \int x\, K^X(t, dx).
\]
Proof. By definition of the compensator we have
\[
\Delta A^X(t) = E(\Delta X(t)\mid\mathscr{F}_{t-1}) = \int x\, P^{\Delta X(t)|\mathscr{F}_{t-1}}(dx) = \int x\, K^X(t, dx) = a^X(t) = \Delta(a^X \bullet I)(t).
\]
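For a concrete instance of Lemma 1.20, consider a binomial random walk. The following sketch (parameters invented for illustration) verifies by exact enumeration over all paths that the compensated process X − A^X has constant expectation:

```python
from itertools import product

# Binomial sketch of Lemma 1.20 (all parameters invented): the increments take
# the value u with probability p and d with probability 1 - p, independently of
# the past, so a^X(t) = int x K^X(t, dx) = p*u + (1-p)*d for every t and the
# compensator is A^X(t) = t * a. We check by exact enumeration that
# M = X - A^X satisfies E(M(t)) = M(0) = 0 for every t.
p, u, d, T = 0.3, 0.05, -0.02, 5
a = p * u + (1 - p) * d                 # a^X(t), the same for every t

def prob(path):
    q = 1.0
    for step in path:
        q *= p if step == u else 1 - p
    return q

for t in range(T + 1):
    mean_M = sum(prob(path) * (sum(path) - t * a)
                 for path in product([u, d], repeat=t))
    assert abs(mean_M) < 1e-12
```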
Since the predictable covariation is a compensator, it can also be expressed in terms of compensators and conditional jump distributions.
Lemma 1.21 (Predictable covariation) The predictable covariation of adapted processes X,Y and of their martingale parts M X ,M Y is given by
\[
\langle X, Y\rangle = \tilde c^{XY} \bullet I, \qquad \langle M^X, M^Y\rangle = \hat c^{XY} \bullet I
\]
with
\[
\tilde c^{XY}(t) := \int xy\, K^{(X,Y)}(t, d(x,y)), \qquad \hat c^{XY}(t) := \int xy\, K^{(X,Y)}(t, d(x,y)) - a^X(t)\,a^Y(t)
\]
(provided that the integrals exist).
Proof. The first statement follows similarly to Lemma 1.20 by observing that
\[
\Delta\langle X,Y\rangle(t) = E(\Delta X(t)\,\Delta Y(t)\mid\mathscr{F}_{t-1}).
\]
The second in turn follows from the first and from Lemma 1.20 because
\[
\Delta\langle M^X, M^Y\rangle(t) = E(\Delta M^X(t)\,\Delta M^Y(t)\mid\mathscr{F}_{t-1})
= E(\Delta X(t)\Delta Y(t)\mid\mathscr{F}_{t-1}) - E(\Delta X(t)\mid\mathscr{F}_{t-1})\,\Delta A^Y(t) - \Delta A^X(t)\,E(\Delta Y(t)\mid\mathscr{F}_{t-1}) + \Delta A^X(t)\,\Delta A^Y(t)
= \Delta\langle X,Y\rangle(t) - \Delta A^X(t)\,\Delta A^Y(t)
= \int xy\, K^{(X,Y)}(t, d(x,y)) - a^X(t)\,a^Y(t).
\]
Let us rephrase the integration by parts rule in terms of characteristics.
Lemma 1.22 We have
\[
a^{XY}(t) = X(t-1)\,a^Y(t) + Y(t-1)\,a^X(t) + \tilde c^{XY}(t)
\]
provided that X, Y and XY are integrable adapted processes.
Proof. Computing the compensators of
\[
XY = X(0)Y(0) + X_- \bullet Y + Y_- \bullet X + [X,Y]
\]
yields
\[
a^{XY} \bullet I = (X_-\, a^Y) \bullet I + (Y_-\, a^X) \bullet I + \tilde c^{XY} \bullet I
\]
by Lemmas 1.10(9) and 1.21. Considering increments yields the claim.
1.4 Essential supremum
In the context of optimal control we need to consider suprema of possibly uncountably many random variables. To this end, let 𝒢 denote a sub-σ-field of ℱ and (X_i)_{i∈I} a family of 𝒢-measurable random variables with values in [−∞, ∞].

Definition 1.23 A 𝒢-measurable random variable Y is called essential supremum of (X_i)_{i∈I} if Y ≥ X_i almost surely for any i ∈ I and Y ≤ Z almost surely for any 𝒢-measurable random variable Z such that X_i ≤ Z almost surely for any i ∈ I. We write
\[
Y =: \operatorname{ess\,sup}_{i\in I} X_i.
\]

Lemma 1.24 1. The essential supremum exists. It is almost surely unique in the sense that any two random variables Y, Ỹ as in Definition 1.23 coincide almost surely.
2. There is a countable subset J ⊂ I such that ess sup_{i∈J} X_i = ess sup_{i∈I} X_i.

Proof. By considering X̃_i := arctan X_i instead of X_i, we may assume w.l.o.g. that the X_i all have values in the same bounded interval.

Observe that sup_{i∈C} X_i is a 𝒢-measurable random variable for any countable subset C of I. We denote the set of all such countable subsets of I by 𝒞. Consider a sequence (C_n)_{n∈ℕ} in 𝒞 such that E(sup_{i∈C_n} X_i) ↑ sup_{C∈𝒞} E(sup_{i∈C} X_i). Then C_∞ := ∪_{n∈ℕ} C_n is a countable subset of I satisfying E(sup_{i∈C_∞} X_i) = sup_{C∈𝒞} E(sup_{i∈C} X_i). We show that Y := sup_{i∈C_∞} X_i meets the requirements of an essential supremum. Indeed, for fixed i ∈ I we have Y ≤ Y ∨ X_i and E(Y) = E(Y ∨ X_i) < ∞, which implies that Y = Y ∨ X_i and hence X_i ≤ Y almost surely. On the other hand, for any Z as in the definition we have Z ≥ X_i, i ∈ I, and hence Z ≥ Y almost surely.
The following result helps to approximate the essential supremum by a sequence of random variables.
Lemma 1.25 Suppose that (X_i)_{i∈I} has the lattice property, i.e. for any i, j ∈ I there exists k ∈ I such that X_i ∨ X_j ≤ X_k almost surely. Then there is a sequence (i_n)_{n∈ℕ} in I such that X_{i_n} ↑ ess sup_{i∈I} X_i almost surely.

Proof. Choose a sequence (j_n)_{n∈ℕ} such that J = {j_n : n ∈ ℕ} holds for the countable set J ⊂ I in statement 2 of Lemma 1.24. By the lattice property we can choose (i_n)_{n∈ℕ} inductively such that X_{i_1} = X_{j_1} and X_{i_n} ≥ X_{i_{n-1}} ∨ X_{j_n} almost surely. Then (X_{i_n})_{n∈ℕ} is almost surely nondecreasing with X_{i_n} ≤ ess sup_{i∈I} X_i and sup_{n∈ℕ} X_{i_n} ≥ sup_{n∈ℕ} X_{j_n} = ess sup_{i∈J} X_i = ess sup_{i∈I} X_i almost surely, which yields the claim.
If the lattice property holds, essential supremum and expectation can be interchanged. 26 CHAPTER 1. RECAP OF STOCHASTIC PROCESSES
Lemma 1.26 Let X_i ≥ 0, i ∈ I, or E(ess sup_{i∈I} |X_i|) < ∞. If (X_i)_{i∈I} has the lattice property, then
\[
E\big(\operatorname{ess\,sup}_{i\in I} X_i \,\big|\, \mathscr{H}\big) = \operatorname{ess\,sup}_{i\in I} E(X_i \mid \mathscr{H})
\]
for any sub-σ-field ℋ of ℱ.

Proof. Since E(ess sup_{i∈I} X_i | ℋ) ≥ E(X_j | ℋ) a.s. for any j ∈ I, we obviously have E(ess sup_{i∈I} X_i | ℋ) ≥ ess sup_{i∈I} E(X_i | ℋ). For the converse, let (i_n)_{n∈ℕ} be a sequence as in the previous lemma. Then X_{i_n} ↑ ess sup_{i∈I} X_i a.s., and monotone resp. dominated convergence implies E(X_{i_n} | ℋ) ↑ E(ess sup_{i∈I} X_i | ℋ) and hence the claim.

Chapter 2
Dynamic Programming
Since we consider discrete-time stochastic control in this part, we work on a filtered probability space (Ω, ℱ, 𝔽, P) with filtration 𝔽 = (F_t)_{t=0,1,...,T}. For simplicity, we assume F_0 to be trivial, i.e. all F_0-measurable random variables are deterministic. By (1.1) this implies E(X|F_0) = E(X) for any random variable X.

Our goal is to maximise some expected reward E(u(α)) over controls α ∈ 𝒜. The set 𝒜 of admissible controls is a subset of the set of all ℝ^m-valued adapted processes, and it is assumed to be stable under bifurcation, i.e. for any stopping time τ, any event B ∈ F_τ, and any α, α̃ ∈ 𝒜 with α^τ = α̃^τ, the process (α|_τB|α̃) defined by
\[
(\alpha|_\tau B|\tilde\alpha)(t) := 1_{B^c}\,\alpha(t) + 1_B\,\tilde\alpha(t)
\]
is again an admissible control. Intuitively, this means that the decision how to continue may depend on the observations so far. Moreover, we suppose that α(0) coincides for all controls α ∈ 𝒜.

The reward is expressed by some reward function u: Ω × (ℝ^m)^{\{0,1,...,T\}} → ℝ ∪ {−∞}. For fixed α ∈ 𝒜, we use the shorthand u(α) for the random variable ω ↦ u(ω, α(ω)). The reward is meant to refer to some fixed time T ∈ ℕ, which is expressed mathematically by the assumption that u(α) is F_T-measurable for any α ∈ 𝒜.
Example 2.1 Typically, the reward function is of the form
\[
u(\alpha) = \sum_{t=1}^T f\big(t, X^{(\alpha)}(t-1), \alpha(t)\big) + g\big(X^{(\alpha)}(T)\big)
\]
for some functions f: {1,...,T} × ℝ^d × ℝ^m → ℝ ∪ {−∞}, g: ℝ^d → ℝ ∪ {−∞}, and ℝ^d-valued adapted controlled processes X^{(α)}.
Definition 2.2 We call α* ∈ 𝒜 an optimal control if it maximises E(u(α)) over all α ∈ 𝒜 (where we set E(u(α)) := −∞ if E(u(α)^−) = ∞). Moreover, the value process of the optimisation problem is the family (J(·, α))_{α∈𝒜} of adapted processes defined via
\[
J(t, \alpha) := \operatorname{ess\,sup}\big\{E(u(\tilde\alpha)\mid\mathscr{F}_t) : \tilde\alpha \in \mathscr{A} \text{ with } \tilde\alpha^t = \alpha^t\big\} \tag{2.1}
\]
for t ∈ {0, ..., T}, α ∈ 𝒜.
The value process is characterised by some martingale/supermartingale property:
Theorem 2.3 Suppose that J(0) := sup_{α∈𝒜} E(u(α)) ≠ ±∞.

1. For any admissible control α with E(u(α)) > −∞, the process (J(t, α))_{t∈{0,...,T}} is a supermartingale with terminal value J(T, α) = u(α). If α* is an optimal control, (J(t, α*))_{t∈{0,...,T}} is a martingale.
2. Suppose that (J̃(·, α))_{α∈𝒜} is a family of processes such that
(a) J̃(t, α̃) = J̃(t, α) if α̃^t = α^t. The common value J̃(0, α) is denoted by J̃(0),
(b) (J̃(t, α))_{t∈{0,...,T}} is a supermartingale with terminal value J̃(T, α) = u(α) for any admissible control α with E(u(α)) > −∞,

(c) (J̃(t, α*))_{t∈{0,...,T}} is a martingale for some admissible control α*.

Then α* is optimal and J(t, α*) = J̃(t, α*) for t = 0, ..., T.
Proof.
1. Adaptedness and the terminal value of J(·, α) are evident. In order to show the supermartingale property, let t ∈ {0, ..., T}. Stability under bifurcation implies that the set of all E(u(α̃)|F_t) with α̃ ∈ 𝒜 satisfying α̃^t = α^t has the lattice property. By Lemma 1.25 there exists a sequence of admissible controls α_n with α_n^t = α^t and E(u(α_n)|F_t) ↑ J(t, α). For s ≤ t we have
\[
E\big(E(u(\alpha_n)\mid\mathscr{F}_t)\mid\mathscr{F}_s\big) = E(u(\alpha_n)\mid\mathscr{F}_s) \le J(s, \alpha).
\]
By monotone convergence we obtain the supermartingale property E(J(t, α)|F_s) ≤ J(s, α).

Let α* be an optimal control. Since J(·, α*) is a supermartingale, the martingale property follows from
\[
J(0, \alpha^\star) = \sup_{\alpha\in\mathscr{A}} E(u(\alpha)) = E(u(\alpha^\star)) = E(J(T, \alpha^\star))
\]
and Lemma 1.5.
2. The supermartingale property implies that
\[
E(u(\alpha)) = E(\tilde J(T, \alpha)) \le \tilde J(0, \alpha) = \tilde J(0)
\]
for any admissible control α. Since equality holds for α*, we conclude that α* is optimal. By statement 1, J(·, α*) is a martingale with terminal value u(α*). Since the same is true for J̃(·, α*) by assumptions 2b,c), we have J(t, α*) = E(u(α*)|F_t) = J̃(t, α*) for t = 0, ..., T.
The previous theorem does not immediately lead to an optimal control, but it often helps to verify that some candidate control is in fact optimal.
Remark 2.4 1. It may happen that the supremum in the definition of the optimal value is not a maximum, i.e. an optimal control does not exist. In this case Theorem 2.3 cannot be applied. Sometimes this problem can be circumvented by considering a certain closure of the set of admissible controls which does in fact contain the optimiser. If this is not feasible, a variation of Theorem 2.3(2) without assumption 2c) may be of interest. The supermartingale property 2b) of the candidate value process ensures that J̃(0) is an upper bound of the optimal value J(0). If, for any ε > 0, one can find an admissible control α^{(ε)} with J̃(0) ≤ E(J̃(T, α^{(ε)})) + ε, then J̃(0) = J(0) and the α^{(ε)} yield a sequence of controls approaching this optimal value.
2. The above setup allows for a straightforward extension to infinite time horizon T = ∞.
The value process can be obtained recursively, starting from t = T:

Proposition 2.5 (Dynamic programming principle) Suppose that J(0) ≠ ±∞. For t = 1, ..., T and any control α ∈ 𝒜 we have
\[
J(t-1, \alpha) = \operatorname{ess\,sup}\big\{E(J(t, \tilde\alpha)\mid\mathscr{F}_{t-1}) : \tilde\alpha \in \mathscr{A} \text{ with } \tilde\alpha^{t-1} = \alpha^{t-1}\big\}.
\]
Proof. The inequality "≤" is obvious because E(u(α̃)|F_t) ≤ J(t, α̃) and hence
\[
E(u(\tilde\alpha)\mid\mathscr{F}_{t-1}) = E\big(E(u(\tilde\alpha)\mid\mathscr{F}_t)\mid\mathscr{F}_{t-1}\big) \le E(J(t, \tilde\alpha)\mid\mathscr{F}_{t-1}).
\]
In order to verify the converse inequality "≥", fix a control α̃ ∈ 𝒜 with α̃^{t−1} = α^{t−1}. Let (α_n)_{n=1,2,...} be a sequence of controls with α_n^t = α̃^t and such that E(u(α_n)|F_t) ↑ J(t, α̃) for n → ∞. For any n we have
\[
E\big(E(u(\alpha_n)\mid\mathscr{F}_t)\mid\mathscr{F}_{t-1}\big) = E(u(\alpha_n)\mid\mathscr{F}_{t-1}) \le J(t-1, \alpha).
\]
Monotone convergence yields E(J(t, α̃)|F_{t−1}) ≤ J(t − 1, α), which implies the desired inequality.
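In the Markovian case the dynamic programming principle translates directly into a backward recursion. The following sketch applies it to an invented toy problem: the control a ∈ {0, 1} is a jump size, the jump is +a or −a with probability ½ each, and the terminal reward g(x) = −x² penalises distance from the origin.

```python
from functools import lru_cache

# Toy backward induction J(t-1, x) = max_a E(J(t, x + jump)); all ingredients
# are invented for illustration.
T = 5

def g(x):
    return -x * x

@lru_cache(maxsize=None)
def J(t, x):
    if t == T:
        return g(x)
    # dynamic programming principle: maximise the expected continuation value
    return max(0.5 * J(t + 1, x + a) + 0.5 * J(t + 1, x - a) for a in (0, 1))

# Jumping can only increase E(x^2), so the optimal control never moves and
# J(t, x) = -x^2 for all t.
assert J(0, 0) == 0.0
assert J(0, 2) == -4.0
```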
Moreover, the value process has a certain minimality property:
Proposition 2.6 Suppose that J(0) ≠ ±∞. For any family of processes (J̃(·, α))_{α∈𝒜} satisfying 2b) in Theorem 2.3, we have J(t, α) ≤ J̃(t, α) for t = 0, ..., T and any control α ∈ 𝒜.

Proof. For any control α̃ satisfying α̃^t = α^t we have
\[
E(u(\tilde\alpha)\mid\mathscr{F}_t) = E(\tilde J(T, \tilde\alpha)\mid\mathscr{F}_t) \le \tilde J(t, \tilde\alpha) = \tilde J(t, \alpha),
\]
which implies that
\[
J(t, \alpha) = \operatorname{ess\,sup}\big\{E(u(\tilde\alpha)\mid\mathscr{F}_t) : \tilde\alpha \in \mathscr{A} \text{ with } \tilde\alpha^t = \alpha^t\big\} \le \tilde J(t, \alpha).
\]
As an example we consider the Merton problem to maximise expected logarithmic utility of terminal wealth.
Example 2.7 (Logarithmic utility of terminal wealth) An investor trades in a market consisting of a constant bank account and a stock whose price at time t equals
\[
S(t) = S(0)\,\mathscr{E}(X)(t) = S(0)\prod_{s=1}^t(1 + \Delta X(s))
\]
with ∆X(t) > −1. Given that ϕ(t) denotes the number of shares in the investor's portfolio from time t−1 to t, the profits from the stock investment in this period are ϕ(t)∆S(t). If v_0 > 0 denotes the investor's initial endowment, her wealth at any time t amounts to
\[
V_\varphi(t) := v_0 + \sum_{s=1}^t \varphi(s)\Delta S(s) = v_0 + \varphi \bullet S(t). \tag{2.2}
\]
We assume that the investor's goal is to maximise the expected logarithmic utility E(log(V_ϕ(T))) of wealth at time T. To this end, we assume that the stock price process S is exogenously given and the investor's set of admissible controls is
\[
\mathscr{A} := \{\varphi \text{ predictable} : V_\varphi \ge 0 \text{ and } \varphi(0) = 0\}.
\]
It turns out that the problem becomes more transparent if we consider the relative portfolio
\[
\pi(t) := \varphi(t)\,\frac{S(t-1)}{V_\varphi(t-1)}, \quad t = 1, \dots, T, \tag{2.3}
\]
i.e. the fraction of wealth invested in the stock at t−1. Starting with v_0, the stock holdings ϕ(t) and the wealth process V_ϕ(t) are recovered from π via
\[
V_\varphi(t) = v_0\,\mathscr{E}(\pi \bullet X)(t) = v_0\prod_{s=1}^t(1 + \pi(s)\Delta X(s)) \tag{2.4}
\]
and
\[
\varphi(t) = \pi(t)\,\frac{V_\varphi(t-1)}{S(t-1)} = \pi(t)\,\frac{v_0\,\mathscr{E}(\pi \bullet X)(t-1)}{S(t-1)}.
\]
Indeed, (2.4) follows from
\[
\Delta V_\varphi(t) = \varphi(t)\Delta S(t) = \frac{V_\varphi(t-1)\pi(t)}{S(t-1)}\,\Delta S(t) = V_\varphi(t-1)\,\Delta(\pi \bullet X)(t).
\]
If T = 1, a simple calculation shows that the investor should buy ϕ*(1) = π*(1)v_0/S(0) shares at time 0, where the optimal fraction π*(1) maximises the function γ ↦ E(log(1 + γ∆X(1))). We guess that the same essentially holds for multi-period markets, i.e. we assume that the optimal relative portfolio is obtained as the maximiser π*(ω, t) of the mapping
\[
\gamma \mapsto E\big(\log(1 + \gamma\Delta X(t))\,\big|\,\mathscr{F}_{t-1}\big)(\omega). \tag{2.5}
\]
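In a binomial model, ∆X(t) = u with probability p and d with probability 1−p, the maximiser of (2.5) can be computed in closed form: setting the derivative of γ ↦ p log(1+γu) + (1−p) log(1+γd) to zero yields γ* = −(pu + (1−p)d)/(ud). The following sketch (parameters purely illustrative) compares this with a grid search:

```python
import math

# Binomial sketch of (2.5): dX = u with probability p, d with probability 1-p
# (u > 0 > d > -1; the numbers below are invented). The first-order condition
# gives gamma* = -(p*u + (1-p)*d) / (u*d).
p, u, d = 0.6, 0.2, -0.1

def expected_utility(g):
    return p * math.log(1 + g * u) + (1 - p) * math.log(1 + g * d)

gamma_closed = -(p * u + (1 - p) * d) / (u * d)

# grid over fractions keeping wealth positive: 1 + g*u > 0 and 1 + g*d > 0
grid = [i / 1000 for i in range(-4999, 10000)
        if 1 + (i / 1000) * u > 0 and 1 + (i / 1000) * d > 0]
gamma_grid = max(grid, key=expected_utility)

assert abs(gamma_grid - gamma_closed) < 1e-2
```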
The corresponding candidate value process is
\[
J(t, \varphi) := E\bigg(\log\bigg(V_\varphi(t)\prod_{s=t+1}^T\big(1 + \pi^\star(s)\Delta X(s)\big)\bigg)\,\bigg|\,\mathscr{F}_t\bigg)
\]