The Δ− Sensitivity and Its Application to Stochastic Optimal Control of Nonlinear Diffusions
Total Page:16
File Type:pdf, Size:1020Kb
The δ− sensitivity and its application to stochastic optimal control of nonlinear diffusions. Evangelos A. Theodorou1 and Emo Todorov2 Abstract— We provide optimal control laws by using tools sample paths, financial applications require the computation from stochastic calculus of variations and the mathematical of its differential with respect to the initial condition xt0 (δ- concept of δ−sensitivity. The analysis relies on logarithmic delta sensitivity), drift b(x)(ρ-rho sensitivity) and diffusion transformations of the value functions and the use of linearly solvable Partial Differential Equations(PDEs). We derive the σ(x) (ν-vega sensitivity) of the stochastic dynamics in (2). corresponding optimal control as a function of the δ−sensitivity The contribution of this work is to show that the of the logarithmic transformation of the value function for the δ−sensitivity appears in the computation of the optimal con- case of nonlinear diffusion processes affine in control and noise. trol under the logarithmic transformation of value function. In particular we provide explicit formulas for the stochastic I. INTRODUCTION optimal control of Markov diffusion processes affine in con- Stochastic optimal control for Markov diffusion process trol and noise. The analysis relies on the stochastic calculus affine in controls and noise has been considered in different of variations and the concept of Malliavin derivative. Related areas of science and engineering such as machine learning, work in this area traces back to [13] as well as more recent control theory and robotics [1], [3]–[6], [9]–[11], [15], [16]. work on the application of Malliavin calculus to finance [7], One of the common methodologies for solving such optimal [8]. Here we are revisiting the topic of stochastic calculus of control problems is based on the exponential transformations variations and show its applicability to find optimal control of value functions. The main idea is inspired by the fact laws. In addition we provide all the underlying assumptions that linear PDEs can be solved with the application of the regarding smoothness and differentiability conditions of the Feynman-Kac lemma which provides a tractable sampling stochastic dynamics under consideration. numerical scheme. Therefore, instead of working with the The paper is organized as follows. In section III we review initial nonlinear PDE that is the Hamilton-Jacobi-Bellman basic theorem of the stochastic calculus of variations which (HJB) equation, one can use the exponential transformation will be used for computing the δ- sensitivity. In section V of the initial value function to get the corresponding lin- we present the derivation of stochastic optimal control based ear PDE and then apply the Feynman-Kac lemma to find on the logarithmic transformation of the value function. In stochastic representations of its solution. the last section VI we conclude. Parallel to the work in control theory, machine learn- ing and robotics, researchers in the domain of financial II. NOTATION engineering developed tools based on stochastic calculus 2 of variations to find sensitivities of expectations of pay- We denote the norm x 2 L (P ) as jjxjjL2(P ) = off functions with respect to changes in the parameters of (E[x2])1=2 = (R x2(!)P (d!))1=2. Let x 2 <n and the n @h(x) @h(x) T the underlying diffusion processes [2], [7], [8], [12]. These function h(x): < ! < then rh(x) = ( @x ; :::; @x ) . n n n 1 n nn sensitivities are named Greeks because they are often denoted Let b(x): < ! < then Jb(x) : < ! < × < is T by Greek letters. More precisely, many financial applications the Jacobian of b(x) with respect to x defined as Jb(x) = involve the computation of cost function: [rbi(x); ::::; rbN (x)]. Let x = (x1; :::; xn) then the norm p 2 2 2 jjxjj2 = x1 + x2 + ::: + xn. We denote the Banach spaces Ψ(x) = E φ(~x) (1) D1;2 equipped with the corresponding norm jjF jj1;2 that is defined in the next section. where ~x is a sample path ~x = fxt1 ; xt2 ; :::; xtN g gener- ated by forward sampling of the diffusion process: III. STOCHASTIC CALCULUS OF VARIATIONS. In this section we provide the basic definitions and prop- dx = b(x)dt + σ(x(t))dw(t) (2) erties of the stochastic calculus of variations in the form of The cost function Ψ(x) can be computed with Monte- propositions and definitions. Carlo simulations. Despite the estimation of the cost based on Definition 1-(Malliavin Derivative): Let fw(t); 0 ≤ t ≤ T g be a n-dimensional brownian motion on a complete prob- 1 Evangelos A. Theodorou is a Postdoctoral Research Associate with the ability space (Ω; F; P). Let the set G of random variables Department of Computer Science and Engineering,University of Washing- ton, Seattle. [email protected] F of the form: 2 Emo Todorov is Associate professor with the Departments of Computer Z 1 Z 1 Science and Engineering, and Applied Math, University of Washington, F = f h (t)dw(t); ::: h (t)dw(t) (3) [email protected] 1 n Seattle. 0 0 with f 2 G(<n). The set G(<n) consists of infinitely the associated first variation process defined by the stochastic differentiable and rapidly decreasing function on <n and differential equation: 2 h1; h2; :::; hn 2 L (Ω × <+). For F 2 G the Malliavin n derivative DF of f is defined as the process fD F; 0 ≤ tg X i t dΥ(t) = Jb(x(t))Υ(t)dt+ Jσi (x(t))Υ(t)dw (t) (10) 2 2 with DtF : L (Ω × <+) ! L (<+) and expressed by the i=1 equation: where the initial condition Υ(0) = In and Jb(x(t)) and n Z 1 Z 1 J (x(t)) the jacobians of b and σ . I is the identity X @f σi i n D F = h (t)dw(t); ::: h (t)dw(t) h (t) n t @x 1 n i matrix of < , with σi(x(t)) denoting the i − th column 1 i 0 0 (4) vector σ. Then the process fx(t); t ≥ 0g belongs to D1;2 for t ≥ 0. and its Malliavin derivative is given by: −1 Definition 2: Let the norm jjF jj1;2 denoted as: Dsx(t) = Υ(t)Υ(s) σ(x(s))1s≤t; s ≥ 0 (11) Z 1 1=2 1 n 2 1=2 2 Hence, if 2 Cb (< ) then we have: jjF jj1;2 = E(F ) + E (DtF ) dt (5) 0 −1 Ds (xT ) = r (xT )Υ(t)Υ(s) σ(x(s))1s≤t (12) then D1;2 denotes the Banach space that is the completion of and also: G with respect to the norm jj jj1;2. The stochastic derivative Z T Z T operator Dt is the a closed linear operator defined in D1;2 −1 2 Ds (xt)dt = r (xt)Υ(t)Υ(s) σ(x(s))dt that takes values in L (Ω × <+). 0 s Next we provide a set of propositions regarding the use (13) of the Malliavin derivative. Proof: We consider the stochastic dynamics Proposition1-(chain rule): Let q(x): <n ! < be a dx = b(x)dt + σ(x(t))dw(t) continuously differentiable function with bounded partial derivatives and F = (F1;F2; :::; Fn) a random vector whose written in the form: components Fi 2 D1;2 then q(F ) 2 D1;2 and its Malliavin Z t Z t derivative is defined as: x(t) − x(0) = b(x)dτ + σ(x(t))dw(τ) (14) 0 0 n X @q or D q(F ) = (F ) D F (6) t @x t i=1 i Z t Z t N Proposition 2-(The fundamental theorem of Calculus) X x(t) − x(0) = b(x)dτ + σj(x(τ))dwj(τ) Let u = u(s); s 2 [0;T ] be a stochastic processes such that 0 0 j=1 R T 2 E 0 u (s)ds < 1 and assume that 8s 22 [0;T ]; u(s) 2 R T We take the derivative with respect to the initial state: D1;2 then 0 u(s)δw(s) is well defined and belongs to D1;2 and Z t T T dx(t) d Z Z − In = b(x)dτ Dt u(s)δw(s) = Dtu(s)δw(s) + u(t) (7) dx(0) 0 dx(0) 0 0 N Z t d X Proposition 3-(Duality Formula): Let F 2 be F + σ (x(τ))dw (τ) D1;2 T dx(0) j j measurable and let u be an F-adapted process with: 0 j=1 Z tN we will have that: 2 E u dt < 1 Z t 0 db(x) dx(τ) Υ(t) − In = dτ Then: 0 dx(τ) dx(0) Z Z Z t N X dσj(x(τ)) dx(τ) E F u(t)dW (t) = E u(t)D F dt (8) + dw (τ) (15) t dx(τ) dx(0) j 0 j=1 Next we provide an important proposition that shows that where the term Υ(t) ≡ dx(t) . In a more compact the for the case of Markov diffusion processes the stochastic dx(t0) derivative operator is related to the derivative of the processes equation above is written as: with respect to the initial condition. t t N Z Z X Proposition 4-(Malliavin Derivative and Diffusions Pro- Υ(t)−In = Jb(x)Υ(τ)dτ + Jσj (x)Υ(τ)dwj(τ) cesses): Let x(t); t ≥ 0 be an <n valued Itoˆ process whose 0 0 j=1 dynamics are driven by the stochastic differential equation: or in a more compact form: dx = b(x)dt + σ(x(t))dw(t) (9) N X dΥ(t) = Jb(x)Υ(t)dt+ Jσ (x)Υ(t)dwj(t); Υ(0) = I: where b(x) and σ(x) are continuously differentiable func- j j=1 tions with bounded derivatives.