Economics 2010C: Lecture 1 Introduction to Dynamic Programming

David Laibson
9/02/2014

Outline of my half-semester course:
1. Discrete time methods (Bellman Equation, Contraction Mapping Theorem, Blackwell's Sufficient Conditions, numerical methods)
• Applications to growth, search, consumption, asset pricing
2. Continuous time methods (Bellman Equation, Brownian Motion, Ito Process, Ito's Lemma)
• Applications to search, consumption, price-setting, investment, industrial organization, asset pricing

Outline of today's lecture:
1. Introduction to dynamic programming
2. The Bellman Equation
3. Three ways to solve the Bellman Equation
4. Application: Search and stopping problem

1 Introduction to dynamic programming

• Course emphasizes methodological techniques and illustrates them through applications.
• We start with discrete-time dynamic optimization.
• Is optimization a ridiculous model of human behavior? Why or why not?
• Today we'll start with an $\infty$-horizon stationary problem:

The Sequence Problem (cf. Stokey and Lucas)

Notation:
$x_t$ is the state vector at date $t$
$F(x_t, x_{t+1})$ is the flow payoff at date $t$ ($F$ is 'stationary')
$\delta^t$ is the exponential discount function
$\delta$ is referred to as the exponential discount factor

The discount rate is the rate of decline of the discount function, so
$$\rho \equiv -\ln \delta = -\left[\frac{d\delta^t / dt}{\delta^t}\right]$$
Note that $\exp(-\rho) = \delta$ and $\exp(-\rho t) = \delta^t$.

Definition of Sequence Problem: Find $v(x)$ such that
$$v(x_0) = \sup_{\{x_{t+1}\}_{t=0}^{\infty}} \sum_{t=0}^{\infty} \delta^t F(x_t, x_{t+1})$$
subject to $x_{+1} \in \Gamma(x)$ with $x_0$ given.

Remark 1.1 When I omit time subscripts, this implies that an equation holds for all relevant values of $t$. In the statement above, $x_{+1} \in \Gamma(x)$ implies
$$x_{t+1} \in \Gamma(x_t) \text{ for all } t = 0, 1, 2, \ldots$$

Example 1.1 Optimal growth with log utility and Cobb-Douglas technology:
$$\sup_{\{c_t\}_{t=0}^{\infty}} \sum_{t=0}^{\infty} \delta^t \ln(c_t)$$
subject to the constraints $c, k \geq 0$, $c + k_{+1} = k^{\alpha}$, and $k_0$ given.
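As a rough illustration of Example 1.1, the sketch below evaluates the discounted sum of log utility along one feasible plan. The constant savings rule $k_{+1} = s k^{\alpha}$, the function name, the parameter values, and the finite truncation horizon are all assumptions made for illustration; none of them come from the lecture. Any feasible plan gives only a lower bound on the supremum.

```python
import math


def discounted_log_utility(k0, alpha, delta, savings_rate, horizon=500):
    """Truncated sum of delta^t * ln(c_t) under the rule k_{t+1} = s * k_t^alpha.

    The constant savings rate s is a hypothetical plan chosen for illustration.
    It is feasible whenever 0 < s < 1, because then 0 <= k_{t+1} <= k_t^alpha.
    """
    k = k0
    total = 0.0
    for t in range(horizon):
        output = k ** alpha
        k_next = savings_rate * output   # next period's capital
        c = output - k_next              # resource constraint: c + k_{+1} = k^alpha
        total += delta ** t * math.log(c)
        k = k_next
    return total


if __name__ == "__main__":
    alpha, delta, k0 = 0.3, 0.95, 1.0    # illustrative values only
    for s in (0.1, 0.2, 0.3, 0.4, 0.5):
        value = discounted_log_utility(k0, alpha, delta, s)
        print(f"s = {s:.1f}: discounted utility = {value:.4f}")
```

Comparing the printed values across savings rates shows how the objective ranks alternative feasible plans; the Sequence Problem asks for the supremum over all of them.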
Translate this problem into Sequence Problem notation by (1) eliminating redundant variables and (2) introducing constraint correspondence $\Gamma$.

Example 1.2 Optimal growth translated into Sequence Problem notation:
$$v(k_0) = \sup_{\{k_{t+1}\}_{t=0}^{\infty}} \sum_{t=0}^{\infty} \delta^t \ln(k_t^{\alpha} - k_{t+1})$$
such that $k_{+1} \in [0, k^{\alpha}] \equiv \Gamma(k)$ and $k_0$ given.

2 Bellman Equation

Compare Sequence Problem and Bellman Equation.

Definition: Bellman Equation expresses the value function as a combination of a flow payoff and a discounted continuation payoff:
$$v(x) = \sup_{x_{+1} \in \Gamma(x)} \{F(x, x_{+1}) + \delta v(x_{+1})\} \quad \forall x$$

• Flow payoff is $F(x, x_{+1})$.
• Current value function is $v(x)$. Continuation value function is $v(x_{+1})$.
• Equation holds for all (feasible) values of $x$.
• We call $v(\cdot)$ the solution to the Bellman Equation.
• Note that any old function won't solve the Bellman Equation.
• We haven't yet demonstrated that there exists even one function $v(\cdot)$ that will satisfy the Bellman Equation.
• We will show that the (unique) value function defined by the Sequence Problem is also the unique solution to the Bellman Equation.

A solution to the Sequence Problem is also a solution to the Bellman Equation.
$$\begin{aligned}
v(x_0) &= \sup_{x_{+1} \in \Gamma(x)} \sum_{t=0}^{\infty} \delta^t F(x_t, x_{t+1}) \\
&= \sup_{x_{+1} \in \Gamma(x)} \left\{ F(x_0, x_1) + \sum_{t=1}^{\infty} \delta^t F(x_t, x_{t+1}) \right\} \\
&= \sup_{x_{+1} \in \Gamma(x)} \left\{ F(x_0, x_1) + \delta \sum_{t=1}^{\infty} \delta^{t-1} F(x_t, x_{t+1}) \right\} \\
&= \sup_{x_1 \in \Gamma(x_0)} \left\{ F(x_0, x_1) + \delta \sup_{x_{+1} \in \Gamma(x)} \sum_{t=0}^{\infty} \delta^t F(x_{t+1}, x_{t+2}) \right\} \\
&= \sup_{x_1 \in \Gamma(x_0)} \{ F(x_0, x_1) + \delta v(x_1) \}
\end{aligned}$$

A solution to the Bellman Equation is also a solution to the Sequence Problem.
$$\begin{aligned}
v(x_0) &= \sup_{x_1 \in \Gamma(x_0)} \{ F(x_0, x_1) + \delta v(x_1) \} \\
&= \sup_{x_{+1} \in \Gamma(x)} \{ F(x_0, x_1) + \delta [ F(x_1, x_2) + \delta v(x_2) ] \} \\
&\;\;\vdots \\
&= \sup_{x_{+1} \in \Gamma(x)} \left\{ F(x_0, x_1) + \cdots + \delta^{n-1} F(x_{n-1}, x_n) + \delta^n v(x_n) \right\} \\
&= \sup_{x_{+1} \in \Gamma(x)} \sum_{t=0}^{\infty} \delta^t F(x_t, x_{t+1})
\end{aligned}$$
Sufficient condition: $\lim_{n \to \infty} \delta^n v(x_n) = 0$ $\forall$ feasible sequences (Stokey and Lucas Thm. 4.3).

In summary, a solution to the Bellman Equation will also be a solution to the Sequence Problem and vice versa.

Example 2.1 Optimal growth in Sequence Problem notation:
$$v(k_0) = \sup_{\{k_{t+1}\}_{t=0}^{\infty}} \sum_{t=0}^{\infty} \delta^t \ln(k_t^{\alpha} - k_{t+1})$$
such that $k_{+1} \in [0, k^{\alpha}] \equiv \Gamma(k)$ and $k_0$ given.

Optimal growth in Bellman Equation notation:
$$v(k) = \sup_{k_{+1} \in \Gamma(k)} \{ \ln(k^{\alpha} - k_{+1}) + \delta v(k_{+1}) \} \quad \forall k$$

3 Solving the Bellman Equation

• Three methods
1. guess a solution (that's no typo)
2. iterate functional operator analytically (what's a functional operator?)
3. iterate functional operator numerically
• Method 1 today.
• Guess a function $v(k)$, and then check to see that this function satisfies the Bellman Equation at all possible values of $k$.
• For our growth example, guess that the solution of the growth problem takes the form:
$$v(k) = \phi + \psi \ln(k)$$
where $\phi$ and $\psi$ are constants for which we need to find solutions.
• Here value function inherits functional form of utility function ($\ln$).
• To solve for constants rewrite Bellman Equation:
$$v(k) = \sup_{k_{+1} \in \Gamma(k)} \{ \ln(k^{\alpha} - k_{+1}) + \delta v(k_{+1}) \} \quad \forall k$$
$$\phi + \psi \ln(k) = \sup_{k_{+1} \in \Gamma(k)} \{ \ln(k^{\alpha} - k_{+1}) + \delta [ \phi + \psi \ln(k_{+1}) ] \} \quad \forall k$$

First order condition (FOC) on the right-hand side of the Bellman Equation:
$$\frac{\partial F(x, x_{+1})}{\partial x_{+1}} + \delta v'(x_{+1}) = 0$$

Envelope Theorem:
$$v'(x) = \frac{\partial F(x, x_{+1})}{\partial x}$$

Heuristic Proof of Envelope Theorem:
$$\begin{aligned}
v'(x) &= \frac{\partial F(x, x_{+1})}{\partial x} + \frac{\partial F(x, x_{+1})}{\partial x_{+1}} \frac{\partial x_{+1}}{\partial x} + \delta v'(x_{+1}) \frac{\partial x_{+1}}{\partial x} \\
&= \frac{\partial F(x, x_{+1})}{\partial x} + \left[ \frac{\partial F(x, x_{+1})}{\partial x_{+1}} + \delta v'(x_{+1}) \right] \frac{\partial x_{+1}}{\partial x} \\
&= \frac{\partial F(x, x_{+1})}{\partial x}
\end{aligned}$$

Problem Set 1 asks you to use the FOC and the Envelope Theorem to solve for $\phi$ and $\psi$. You will also confirm that $v(k) = \phi + \psi \ln(k)$ is a solution to the Bellman Equation.

4 Search and optimal stopping

Example 4.1 An agent draws an offer, $x$, from a uniform distribution with support in the unit interval. The agent can either accept the offer and realize net present value $x$ (ending the game), or the agent can reject the offer and draw again a period later. All draws are independent. Rejections are costly because the agent discounts the future exponentially with discount factor $\delta$. This game continues until the agent receives an offer that she is willing to accept.

• The Bellman equation for this problem is (relatively) easy to write:
$$v(x) = \max \{ x, \; \delta E[v(x_{+1})] \} \qquad (1)$$

Our problem is to find the value function $v(\cdot)$ that solves equation (1). We'll also want to find the associated policy rule.

Definition: A policy is a function that maps $x$ to the action space.

Definition: An optimal policy achieves payoff $v(x)$ for all feasible $x$.
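One way to get a feel for equation (1) is to apply method 3 from Section 3 and iterate the Bellman operator numerically. The sketch below is a minimal version under stated assumptions: the grid on the unit interval, the trapezoid rule for the expectation, the zero initial guess, the tolerance, and the function name are all illustrative choices, not part of the lecture.

```python
def iterate_search_bellman(delta, n_grid=2001, tol=1e-12, max_iter=10_000):
    """Iterate v <- max{x, delta * E[v(x_{+1})]} on a grid over [0, 1].

    Returns the grid, the (approximately) converged value function on that
    grid, and the continuation value delta * E[v(x_{+1})] used to build it.
    Assumes 0 <= delta < 1 and max_iter >= 1.
    """
    xs = [i / (n_grid - 1) for i in range(n_grid)]
    v = [0.0] * n_grid                      # initial guess v_0(x) = 0
    cont = 0.0
    for _ in range(max_iter):
        # Trapezoid rule for E[v(x_{+1})] under the uniform density on [0, 1].
        expected_v = (sum(v) - 0.5 * (v[0] + v[-1])) / (n_grid - 1)
        cont = delta * expected_v
        v_new = [max(x, cont) for x in xs]  # accept x or keep searching
        if max(abs(a - b) for a, b in zip(v_new, v)) < tol:
            return xs, v_new, cont
        v = v_new
    return xs, v, cont


if __name__ == "__main__":
    xs, v, cont = iterate_search_bellman(delta=0.9)  # delta is illustrative
    print(f"continuation value delta*E[v] = {cont:.6f}")
    print(f"v(0) = {v[0]:.6f}, v(1) = {v[-1]:.6f}")
```

The converged function is flat at the continuation value for low offers and equal to $x$ for high offers, which is the shape a threshold policy would produce.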
Proposition: In the search and optimal stopping problem, the threshold policy with cutoff $x^*$ is a best response to any continuation value function, $\hat{v}$, if and only if (iff)
$$x^* = \delta E[\hat{v}(x_{+1})]$$

Proof: Optimization generates the following policy:
ACCEPT iff $x \geq x^* = \delta E[\hat{v}(x_{+1})]$
REJECT iff $x \leq x^* = \delta E[\hat{v}(x_{+1})]$
If $x = x^* = \delta E[\hat{v}(x_{+1})]$ then ACCEPT and REJECT generate the same payoff. $\blacksquare$

• Find threshold $x^*$ so that the associated value function,
$$v(x) = \begin{cases} x^* & \text{if } x \leq x^* \\ x & \text{if } x \geq x^* \end{cases} \qquad (2)$$
satisfies the Bellman Equation.
• In other words, find the value of $x^*$ so that $v(x)$ (defined in equation 2) solves the Bellman Equation (equation 1).

If $x = x^*$ you should be indifferent between stopping and continuing.
$$\begin{aligned}
v(x^*) = x^* &= \delta E[v(x_{+1})] \\
&= \delta \int_{x=0}^{x=x^*} x^* f(x)\,dx + \delta \int_{x=x^*}^{x=1} x f(x)\,dx \\
&= \delta \tfrac{1}{2} (x^*)^2 + \delta \tfrac{1}{2}
\end{aligned}$$
(with $f(x) = 1$ on the unit interval, since offers are uniform).

So the final result is
$$x^* = \frac{\delta}{2} \left[ (x^*)^2 + 1 \right]$$
which has solution
$$x^* = \delta^{-1} \left( 1 - \sqrt{1 - \delta^2} \right)$$

Always think about comparative statics and sensibility of the answer.

[Figure: Optimal threshold in stopping problem. Vertical axis: optimal threshold $x^*$. Horizontal axis: discount rate $= -\ln(\delta)$. The threshold converges to 1 as the discount rate goes to 0 and converges to 0 as the discount rate goes to $\infty$.]
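The comparative statics can be checked with a few lines of code. The sketch below evaluates the closed-form threshold $x^* = \delta^{-1}(1 - \sqrt{1 - \delta^2})$ for several discount rates $\rho = -\ln(\delta)$ and confirms that it satisfies the indifference condition $x^* = \frac{\delta}{2}[(x^*)^2 + 1]$. The function name and the particular discount rates are assumptions chosen for illustration.

```python
import math


def optimal_threshold(delta):
    """Root in [0, 1] of x = (delta / 2) * (x**2 + 1), for 0 < delta <= 1."""
    return (1.0 - math.sqrt(1.0 - delta ** 2)) / delta


if __name__ == "__main__":
    for rho in (0.01, 0.1, 0.5, 1.0, 5.0):   # illustrative discount rates
        delta = math.exp(-rho)               # delta = exp(-rho)
        x_star = optimal_threshold(delta)
        # Residual of the indifference condition; should be close to zero.
        residual = x_star - 0.5 * delta * (x_star ** 2 + 1.0)
        print(f"rho = {rho:5.2f}  delta = {delta:.4f}  "
              f"x* = {x_star:.4f}  residual = {residual:.1e}")
```

A more patient agent (lower discount rate) holds out for a higher offer, so the threshold rises toward 1 as the discount rate falls toward 0 and falls toward 0 as the discount rate grows.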
