Quick viewing(Text Mode)

Production Functions and Productivity Slides

Production Functions and

Paul T. Scott NYU Stern [email protected]

PhD Empirical IO Fall 2017

1 / 84 Intro Overview

I Brief background on industry dynamics I Methods for estimating functions: I Olley Pakes (1995) I Ackerberg, Caves, and Frazer (2015)

I What production analysis can say about market power: I De Loecker and Warzynski (2012)

I These tools are highly relevant outside IO, notably in trade, macro, and development

2 / 84 Intro Some Questions

I Role of entry and exit in driving growth?

I Impact of events like trade liberalization and deregulation on productivity?

I Persistence of productivity within plant/firm?

I What are the factors driving plant/firm-level changes in productivity and growth?

3 / 84 Firm size The firm size distribution

I A very robust finding: the firm size distribution has a long upper tail.

I ... this holds within the vast majority of industries, countries, and after conditioning on observable characteristics.

I Typically, the size distribution is approximated with a lognormal or Pareto distribbtion.

I Broader theme: almost any variable we look at exhibits tremendous heterogeneity across firms

4 / 84 Firm size distribution Gibrat’s Law: a trivial model of growth and heterogeneity

I Gibrat’s law states that if the growth rate of a variable is independent of its size and over time, it will have a log-normal distribution in the long run.

I Let Yit denote firm i’s size (employment or output) in year t . Suppose it evolves according to the following process:

(Yi,t+1 − Yit )/Yit = εit

where εit is i.i.d. across i and t

I Then, after allowing a large group of firms to evolve for a while, the cross-sectional distribution of Yit will have a log-normal distribution.

5 / 84 Firm size distribution

Source: Cabral and Mata (2003)

6 / 84 Firm size distribution Theoretical models of industry dynamics

I While the ”Gibrat Model” of industry dynamics is way too simple to explain all the data, most modern models of heterogeneous firms are based on the assumption that each firm experiences a series of unpredictable and persistent shocks, generating lots of heterogeneity. I In modern models, rather than just having random growth/size, there is a heterogenous productivity variable that determines firm size I Jovanovic (1982), entry and exit model that explains some facts: I Small firms have higher and more variable growth rates I Smaller firms are more likely to exit I See Dunne, Roberts, and Samuelson (1989) for empirical evidence. I Hopenhayn (1992) - equilibrium model with stochastic productivity I Melitz (2003) - Equilibrium trade model in which high-productivity firms select into exporting and low-productivity firms exit.

7 / 84 Firm size distribution What is productivity, and how do we measure it?

I The two most popular measures of productivity are labor productivity and total factor productivity (TFP).

I Labor productivity is defined as the ratio of output to labor inputs (Yt /Lt ).

I TFP is defined as the residual of a production function. For example, with the Cobb-Douglas Production function

ωt α β Yt = e Lt Kt ,

which we can rewrite in logs,

yt = αlt + βkt + ωt .

TFP is ωt (lower case variables represent logs of uppercase variables).

8 / 84 Firm size distribution Concerns with productivity definitions

I Labor productivity can change due to changes in the -labor ratio without any changes in (e.g., due to changes). Consequently, TFP is typically the object of choice for studies on technological change or firm performance.

I That said, TFP is not without its own conceptual and practical limitations. I Unlike labor productivity, TFP is defined in terms of a specific functional form and does not have units. I TFP relies on measurement of capital stocks, which is typically difficult.

9 / 84 Firm size distribution Bartelsman and Doms (2003): overview

Bartelsman and Doms (2003) review some empirical work on productivity. Stylized facts:

I Large productivity dispersion across firms.

I Within firm, productivity is highly but imperfectly persistent.

I There is considerable reallocation within industries; "the aggregate data belie the tremendous turmoil underneath."

10 / 84 Firm size distribution What to make of these residuals?

I “I found the spectacle of economic models yielding large residuals rather uncomfortable, even when the issue was fudged by renaming them technical change and claiming credit for their ’measurement.’ ” – Zvi Griliches

I Bad data could be one reason could be one source of TFP dispersion, but we observe large dispersion everywhere we have data, and measured productivities are connected to real outcomes: I more productive firms are less likely to exit I more productive firms are more likely to be exporters I productivities of entrants tend to be lower than average incumbents

11 / 84 Firm size distribution Simultaneity

I yt = αlt + βkt + ωt

I Generally, we should expect input use to respond to ωt . For example, if capital is set at t − 1 and labor can be adjusted at t, we should expect labor to respond to the current realization of productivity.

I Input prices as instruments are a potential solution, but we often don’t observe them with any variance, and if they do vary, you might question whether the variation is exogenous.

12 / 84 Firm size distribution Selection

I The firms that exit are those that have low productivity draws.

I Selection will be an issue if we want to estimate how the productivity process evolves or how endogenous variables like exporter status impact productivity (e.g., because of Melitz’s selection story).

13 / 84 Olley and Pakes (1996)

"The Dynamics of Productivity in the Telecommunications Equipment Industry" Olley and Pakes (1996)

14 / 84 Olley and Pakes (1996) Overview

I Analyzes effects of deregulation in telecommunications equipment industry.

I Deregulation increases productivity, primarily through reallocation toward more productive establishments.

I Estimation approach deals with simultaneity and selection issues.

15 / 84 Olley and Pakes (1996) Background I

I AT&T had a monopoly on telecommunications services in the US throughout most of the 20th century (note: a telecommunications network is a classic example of a natural monopoly).

I Before the regulatory change, AT&T required that equipment attached to their network must be supplied by the AT&T, and virtually all of their equipment was supplied by their subsidiary, Western Electric. Thus, they leveraged their network monopoly to a monopoly on phones.

16 / 84 Olley and Pakes (1996) Background II

I A change in technology opened up new markets for telecommunications equipment (e.g., fax machines)

I Meanwhile, the FCC (regulatory agency) decided to begin allowing the connection of privately-provided devices to AT&T’s network.

I A surge of entry into telecommunications equipment manufacturing followed in the late 1960’s and 1970’s.

17 / 84 Olley and Pakes (1996) Background III I AT&T continued purchasing primarily from Western Electric into the 1980’s (although consumers were free to purchase devices from other companies). I The divestiture (breakup) of AT&T created seven regional Bell companies that were no longer tied to Western Electric, and they were prohibited from manufacturing their own equipment. I The divestiture was implemented in January 1984. Western Electric’s share dropped dramatically.

18 / 84 Olley and Pakes (1996) Entry

19 / 84 Olley and Pakes (1996) Exit

20 / 84 Olley and Pakes (1996) The model

I Incumbent firms (i) make three decisions: I Whether to exit or continue. If they exit, they receive a fixed scrap value Ψ and never return. I If they stay, they choose labor lit , I and investment iit .

I Capital accumulation:

kt+1 = (1 − δ) kt + it

I Another state variable is age: at+1 = at + 1

21 / 84 Olley and Pakes (1996) Production

I They assume the following Cobb-Douglas production function:

yit = β0 + βaait + βk kit + βl lit + ωit + ηit

where yit is output, kit is capital, lit is labor, ωit is a persistent component of productivity, and ηit is a transient shock to productivity.

I Productivity evolves according to a Markov process: F (·|ω).

I η is either measurement error, or there is no information about it when labor decisions are made.

22 / 84 Olley and Pakes (1996) Equilibrium behavior

I They assume the existence of a Markov perfect equilibrium. Market structure and prices are state variables in the MPE, but they are common across firms, so they can be absorbed into time subscripts for the value function: n Vt (ωt , at , kt ) = max Ψ, supit ≥0 πt (ωt , at , kt ) − c (it ) +βE [Vt+1 (ωt+1, at+1, kt+1) |Jt ]

where Jt represents the information set at time t.

I Equilibrium strategies can be described by functions ωt (at , kt ) and it (ωt , at , kt ).

I A firm will continue if and only if ω ≥ ωt (at , kt ). I Continuing firms invest it = it (ωt , at , kt )

23 / 84 Olley and Pakes (1996) Aside: Markov perfect equilibrium

I Loosely, it means a subgame perfect equilibrium in which strategies are functions of “real” (payoff relevant) state variables. Formally defined by Maskin and Tirole (1988)

I This rules out conditioning on variables that don’t impact present or future payoffs. For example, in the repeated prisoner’s dilemma, cooperation with grim trigger punishments is ruled out.

I Markov perfect equilibrium is to dynamic games what perpetual static Nash is to repeated games.

24 / 84 I Down: firms with high capital have lower cutoffs ωt for exit. Thus, conditional on survival, there is a negative correlation between k and ω

I Up: when productivity is high, a firm will use more labor

I How does selection due to exit bias the capital coefficient estimate?

Olley and Pakes (1996) Thinking about bias

I How does the simultaneity of the input decision bias the labor coefficient?

25 / 84 I Down: firms with high capital have lower cutoffs ωt for exit. Thus, conditional on survival, there is a negative correlation between k and ω

I How does selection due to exit bias the capital coefficient estimate?

Olley and Pakes (1996) Thinking about bias

I How does the simultaneity of the input decision bias the labor coefficient? I Up: when productivity is high, a firm will use more labor

26 / 84 I Down: firms with high capital have lower cutoffs ωt for exit. Thus, conditional on survival, there is a negative correlation between k and ω

Olley and Pakes (1996) Thinking about bias

I How does the simultaneity of the input decision bias the labor coefficient? I Up: when productivity is high, a firm will use more labor

I How does selection due to exit bias the capital coefficient estimate?

27 / 84 Olley and Pakes (1996) Thinking about bias

I How does the simultaneity of the input decision bias the labor coefficient? I Up: when productivity is high, a firm will use more labor

I How does selection due to exit bias the capital coefficient estimate?

I Down: firms with high capital have lower cutoffs ωt for exit. Thus, conditional on survival, there is a negative correlation between k and ω

28 / 84 Olley and Pakes (1996) Productivity inversion

I In a technical paper, Pakes (1994) shows that optimal investment it (ωt , at , kt ) is monotonically increasing in ωt , provided it > 0.

I Given monotonicity, optimal investment can be inverted for productivity: ωit = ht (iit , ait , kit ) .

I We’re going to talk more about the it > 0 requirement with Levinsohn and Petrin (2003).

29 / 84 Olley and Pakes (1996) First stage model

I Substituting in the inversion function,

yit = βl lit + φt (iit , ait , kit ) + ηit

where

φt (iit , ait , kit ) = β0 + βaait + βk kit + ht (iit , ait , kit )

I We can estimate this equation using a semiparametric regression. This may identify βl , but not the other coefficients.

I With Ackerberg, Caves, and Frazer (2015), we will think more carefully about what’s identifying βl , but don’t worry about it for now.

30 / 84 Olley and Pakes (1996) First stage output

ˆ I With βl , we can also estimate φ:

φˆit = yit − βˆl lit

I So far we have estimates of βl and φ. βk k and ω are both in the control function φ, and we would like to separate them. We’re going to use the Markov assumption on ω for identification.

31 / 84 Olley and Pakes (1996)

Identifying βk ,βa

I Let’s first think about how to do this without worrying about exit. Define g (ωi,t−1) = E [ωi,t |ωi,t−1] , so that ωi,t = g (ωi,t−1) + ξi,t

where ξi,t+1 is the innovation (unexpected change) to productivity.

I We can write out a second stage regression equation:

φi,t = βk kit + βaait + g (ωi,t−1) + ξi,t

and note that ωi,t−1 can also be written as a function of (βk , βa):

φi,t = βk kit + βaait + g (φi,t−1 − βk ki,t−1 − βaai,t−1) + ξi,t

32 / 84 Olley and Pakes (1996)

Identifying βk ,βa

I Second stage regression equation:

φi,t = βk kit + βaait + g (φi,t−1 − βk ki,t−1 − βaai,t−1) + ξi,t

I One way to think about this: once we specify a parametric function for g, this basically becomes OLS.

I NLLS: we can guess values of (βk , βa), (nonparametrically) estimate g conditional on those value of (βk , βa), and then back out ξi,t (βk , βa). Search over (βk , βa) to minimize sum of squares of ξi,t (βk , βa).

33 / 84 Olley and Pakes (1996) Selection

I Let Pt = Pr (χt+1 = 1|ωt+1 (kt+1, at+1) , Jt ) be the propensity score for exit.

I As long as the conditional density of ωt+1 has full support, this can be inverted to express ωt+1 = f (Pt , ωt )

34 / 84 Olley and Pakes (1996) The second stage with selection

I Write the expectation of yt+1 − βl lt+1 conditional on survival:

E [yt+1 − βl lt+1|at+1, kt+1, χt+1 = 1]

= βaat+1 + βk kt+1 + g (ωt+1, ωt )

where g (ωt+1, ωt ) = E [ωt+1|ωt , χt+1 = 1]

I Using the inversion of the selection probability, we can write

g (ωt+1, ωt ) = g (f (Pt , ωt ) , ωt )

which can be written more simply as g (Pt , ωt ).

35 / 84 Olley and Pakes (1996) Final step

I Conditional on values of (βa, βk ), we can construct an estimate of ωt = φt − βaat − βk kt

I Finally, write

yt+1 − βl lt+1 = βaat+1 + βk kt+1 + g (Pt , φt − βaat − βk kt ) +ξt+1 + ηt+1

I Again, we can use NLLS to estimate (βk , βa).

I Note that E (ξi,t li,t ) 6= 0 is what creates the need for the first stage.

36 / 84 Olley and Pakes (1996) Estimation steps

1. First stage semi-parametric regression:

yit = βl lit + φt (iit , ait , kit ) + ηit

2. Estimate propensity scores: Pt = Pr (χt+1 = 1|ωt+1 (kt+1, at+1) , Jt ) 3. Estimate remaining parameters:

yt+1 − βl lt+1 = βaat+1 + βk kt+1 + g (Pt , φt − βaat − βk kt ) +ξt+1 + ηt+1

using fact that innovation term ξt+1 is mean-uncorrelated with variables determined at t, including kt+1.

37 / 84 Olley and Pakes (1996)

I Why do within estimators have lower capital coefficients?

38 / 84 Olley and Pakes (1996)

 ˆ ˆ ˆ  I Estimate of productivity: pit = exp yit − βl lit − βk kit − βaait I Plants that eventually exit have low productivity growth I New entrants tend to have lower productivity than continuing establishments I Surviving entrants tend to have greater average productivity growth than incumbents. 39 / 84 Olley and Pakes (1996) Productivity decomposition

PNt I Aggregate productivity: pt = i=1 sit pit . I Can be decomposed as follows:

PNt pt = i=1 (¯st + ∆sit ) (¯pt + ∆pit ) PNt = Nt¯st p¯t + i=1 ∆sit ∆pit PNt =p ¯t + i=1 ∆sit ∆pit

where p¯t are unweighted mean productivity and shares in the cross-section.

I Thus, aggregate productivity decomposes into an unweighted mean and a covariance term.

40 / 84 Olley and Pakes (1996)

41 / 84 Levinsohn and Petrin (2003)

"Estimating Production Functions Using Inputs to Control for Unobservables" Levinsohn and Petrin (2003)

42 / 84 Levinsohn and Petrin (2003) Main idea

I Same general framework as Olley and Pakes (1996)

I Main idea: rather than use investment to control for unobserved productivity, use materials inputs.

I Two proposed benefits: I Investment proxy isn’t valid for plants with zero investment. Zero materials inputs typically an issue in the data. I Investments may be "lumpy" and not respond to some productivity shocks.

43 / 84 Levinsohn and Petrin (2003) Downsides of investment

I We need to drop observations with zero investment, which can lead to a substantial efficiency loss. Zero investments happen at a non-trivial rate in annual production data.

I Firms might face non-convex capital adjustment costs leading to flat regions in the i (ω) function even at positive levels of investment.

I What if investment actually happens with only partial information about productivity and then labor is set once the productivity realization is fully observed?

44 / 84 Levinsohn and Petrin (2003) OP equations

I Production function:

yt = β0 + βl lt + βk kt + ωt + ηt .

I First stage regression:

yt = βl lt + φt (it , kt ) + ηt

with φt (it , kt ) = β0 + βk kt + ωt (it , kt ). I Final regression:

∗ ∗ yt = yt − βl lt = β0 + βk kt + E [ωt |ωt−1] + ηt

∗ where ηt = ηt + (ωt − E (ωt |ωt−1)).

45 / 84 Levinsohn and Petrin (2003) LP equations

I Production function:

yt = β0 + βl lt + βk kt + βmmt + ωt + ηt

I First stage regression:

yt = βl lt + φt (mt , kt ) + ηt

with φt (mt , kt ) = β0 + βk kt + βmmt + ωt (mt , kt ). I Final regression:

∗ ∗ yt = yt − βl lt = β0 + βk kt + βmmt + E [ωt |ωt−1] + ηt

∗ where ηt = ηt + (ωt − E (ωt |ωt−1)).

46 / 84 Levinsohn and Petrin (2003) Invertability

I Just as OP require it (ωt , kt ) be an invertible function of productivity, LP require that input use mt (ωt , kt ) is an invertible function of productivity.

I LP’s monotonicity result relies on easily checked properties of the production function, and some may find this more appealing than a result which relies on a Markov perfect equilibrium.

I Unobserved input price variation may be a problem for the LP invertibility condition (but of course it could be for OP, too).

47 / 84 Levinsohn and Petrin (2003) Checking invertibility

I LP claim that ∂m  sign = sign (f f − f f ) . ∂ω ml lω ll mω I To see this, apply the Implicit function theorem to the FOC’s to get  ∂m   −1   ∂ω fmm fml fmω ∂l = − . ∂ω flm fll flω I Inverting and solving,

∂m fml flω − fll fmω ⇒ = . ∂ω fmm fml

flm fll   fmm fml I By the second-order condition for profit maximization, must flm fll be negative semidefinite. This means it has exactly two negative eigenvalues, which means its determinant is positive. Therefore, the numerator controls the sign.

48 / 84 Levinsohn and Petrin (2003) Zero inputs

Note: in OP’s industry, it was only 8% zeros.

49 / 84 Levinsohn and Petrin (2003) Differences from OP

I LP use a slightly different first stage: u s I First, they estimate E (zt |kt ) for zt = yt , lt , lt , et , ft I They then use no-intercept OLS to estimate:

s s yt − E (yt |kt , mt ) = βs (lt − E (lt |kt , mt )) u u +βs (lt − E (lt |kt , mt )) +βe (et − E (et |kt , mt )) +βf (ft − E (ft |kt , mt )) + ηt

I Second stage is similar, but they have to estimate two coefficients (βm, βk ), so they need two moments: !! kt E ξt = 0 mt−1

50 / 84 Levinsohn and Petrin (2003)

51 / 84 Ackerberg, Caves, and Frazer (2015)

"Structural Identification of Production Functions" Ackerberg, Caves, and Frazer (2015)

52 / 84 Ackerberg, Caves, and Frazer (2015) Overview

I ACF argue that Olley and Pakes’s (1996) and Levinsohn and Petrin’s (2003) approach suffer from collinearity issues.

I They propose a new approach which involves modified assumptions on the timing of input decisions and moves the identification of all coefficients of the production function to the second stage of the estimation.

53 / 84 Ackerberg, Caves, and Frazer (2015) LP’s first stage

I Levinsohn and Petrin’s first-stage regression:

−1 yit = βl lit + ft (mit , kit ) + εit .

I LP’s approach was based on the premise that materials inputs are a variable input and therefore a function of state variables:

mit = mt (ωit , kit ) ,

I They also assume that labor is a variable input (or else we would not be able to exclude it from the inversion), so

lit = lt (ωit , kit ) .

54 / 84 Ackerberg, Caves, and Frazer (2015) LP’s identification problem

I This means we can write:

 −1  −1 yit = βl lt ft (mit , kit ) , kit + ft (mit , kit ) + εit ,

−1 and since we’re being nonparametric about ft ,  −1  it should absorb βl lt ft (mit , kit ) , kit .

I There should be no variation in lit left over to identify βl .

55 / 84 Ackerberg, Caves, and Frazer (2015) Does a parametric inversion help? I Cobb-Douglas production :

yit = βk kit + βl lit + βmmit + ωit + it . I FOC for materials: p βk βl βm−1 ωit m βmKit Lit Mit e = . py I Solving for ω (parametric inversion):   ! 1 pm ωit = ln + ln − βk kit − βl lit + (1 − βm) mit βm py

I Plugging this into the production function, the βl lit terms cancel:   ! 1 pm yit = ln + ln + mit + it . βm py

. . . what identifies βl ?

56 / 84 I While optimization error in lit works econometrically, it’s not the most appealing assumption economically.

Ackerberg, Caves, and Frazer (2015) Collinearity in practice and in principle

I It could be the case that lit takes different values in the data for the same values of (mit , kit ). ACF’s argument is about collinearity in principle, given the assumptions of LP.

I Some potential sources of independent variation: (Which one works?) I unobserved variation in firm-specific input prices. I measurement error in lit or mit I optimization error in lit or mit

57 / 84 Ackerberg, Caves, and Frazer (2015) Collinearity in practice and in principle

I It could be the case that lit takes different values in the data for the same values of (mit , kit ). ACF’s argument is about collinearity in principle, given the assumptions of LP.

I Some potential sources of independent variation: (Which one works?) I unobserved variation in firm-specific input prices. I measurement error in lit or mit I optimization error in lit or mit

I While optimization error in lit works econometrically, it’s not the most appealing assumption economically.

58 / 84 Ackerberg, Caves, and Frazer (2015) Another failed solution

I Note that the whole problem comes about because labor and materials are set simultaneously. This means one way to break the collinearity is to assume they are set with respect to different information sets.

I Let’s try to break the informational equivalence with timing assumptions. Suppose:

I mit is set at time t I lit is set at time t − b with 0 < b < 1 I ω has Markovian in between sub-periods:

p (ωi,t−b|Ii,t−1) = p (ωi,t−b|ωit−1) p (ωit |Ii,t−b) = p (ωi,t |ωi,t−b)

I But this doesn’t work! And neither does having mit set first. (Why?)

59 / 84 Ackerberg, Caves, and Frazer (2015) An implausible solution

I Let’s try again: I lit is set at time t I mit is set at time t − b with 0 < b < 1 I we have a more complicated structure of productivity shocks:

yit = βl lit + βmmit + βk kit + ωi,t−b + ηit ,

p (ωi,t−b|Ii,t−1) = p (ωi,t−b|ωi,t−1) , I and there is some unobservable shock to labor prices which is realized between t − b and t. This shock must be i.i.d.

I lit has its own shock to respond to, creating independent variation, and the productivity inversion still works because the new shock is not a state variable.

I This works, but as ACF argue, it’s rather ad-hoc and difficult to motivate.

60 / 84 Ackerberg, Caves, and Frazer (2015) Collinearity in Olley Pakes

I Olley Pakes’s control function has the same collinearity issue, but ACF argue it can be avoided with assumptions which "might be a reasonable approximation to the true underlying process."

I Assume that lit is set at t − b with 0 < b < 1. ω has a Markovian between subperiods. Then:

lit = lt (ωi,t−b, kit ) ,

so we have variation in lit which is independent of (ωit , kit ).

I Note that even though lit is set before investment iit , investment won’t depend on lit because it is a static input. So the productivity inversion is unchanged.

I These timing assumptions cannot save LP, but they work well with OP.

61 / 84 Ackerberg, Caves, and Frazer (2015) ACF’s alternative procedure I

I Consider value added production function:

yit = βk kit + βl lit + ωit + it .

I ACF’s procedure is based on the same timing assumption that "saves" OP: labor chosen at t − b, slightly earlier than when materials are chosen at t.

I Point of first stage is just to get expected output:

yit = Φt (mit , kit , lit ) + it

where −1 Φt (mit , kit , lit ) = βk kit + βl lit + fit (mit , kit , lit )

... first stage no longer recovers βl .

62 / 84 Ackerberg, Caves, and Frazer (2015) ACF’s alternative procedure II

ˆ I After the first stage, we have Φit , expected output.

I We can construct a measure of productivity given coefficients:

ωˆit (βk , βl ) = Φˆ it − βk kit − βl lit

I Then, non-parametrically regressing ωˆit (βk , βl ) on ωˆi,t−1 (βk , βl ), we can construct the innovations:

ξˆit (βk , βl ) =ω ˆit (βk , βl ) − E (ˆωit (βk , βl ) |ωˆi,t−1 (βk , βl ))

63 / 84 Ackerberg, Caves, and Frazer (2015) ACF’s alternative procedure III

I Estimation relies on the following moments: ! k T −1N−1 X X ξˆ (β , β ) it it k l l t i i,t−1

I In the second stage, these two moments are used to estimate both βk and βl .

I In ACF’s framework, lit isn’t a function of ωit but of ωi,t−b. However, labor will still be correlated with part of the innovation in productivity, so we still need to use lagged labor in the moments.

I The moment with lagged labor is very much in the spirit of OP and LP, and they actually used it as an overidentifying restriction.

64 / 84 Foster, Haltiwanger, and Syverson (2008)

"Reallocation, Firm Turnover, and Efficiency: Selection on Productivity or Profitability" Foster, Haltiwanger, and Syverson (2008)

65 / 84 Foster, Haltiwanger, and Syverson (2008) Overview

I They look at some rare industries where quantity data is available, allowing them to separate physical and revenue productivity

I Findings: I Physical productivity is inversely correlated with price I Young producers charge lower prices than incumbents, meaning the literature understates entrants’ productivity advantages

66 / 84 Foster, Haltiwanger, and Syverson (2008) Measurement

I Productivity is measured as follows:

tfpit = yit − αl lit − αk kit − αmmit − αeeit

I Coefficients (α) are just taken from input shares by industry. I Different measures use different output measures y: I TFPQ uses physical output I TFP uses deflated sales (using standard industry-level deflators from NBER) I TFPR are sales deflated by mean prices observed in their data

67 / 84 Foster, Haltiwanger, and Syverson (2008) Correlations

68 / 84 Foster, Haltiwanger, and Syverson (2008) Demand

I They estimate a demand system for each industry: X ln qit = α0 + α1pit + αt YEARt + α2 ln (INCOMEmt ) + ηit t

where INCOMEmt is the income in a firm’s local market m

I They use the residuals from these regressions as a measure of demand shocks.

69 / 84 Foster, Haltiwanger, and Syverson (2008) Persistence

I correlation between traditional TFP, physical TFP is substantial, but imperfect I negative correlation between price and physical TFP

70 / 84 Foster, Haltiwanger, and Syverson (2008)

I one-standard deviation increases physical TFP and prices seem to have similar impacts on exit probabilities 71 / 84 De Loecker and Warzynski (2012)

"Markups and Firm-Level Export Status" De Loecker and Warzynski (2012)

72 / 84 De Loecker and Warzynski (2012) Overview

I Demonstrates how production function can be used to make inferences about markups

I Applied question: how do markups of exporters differ from non-exporters, and how does a firm’s productivity change when it becomes an exporter.

I Findings: I Exporters have higher markups than importers I Markups increase when a firm becomes an exporter I Note similarity to De Loecker (2011), but focus is now on exporter status rather than trade liberalization

73 / 84 De Loecker and Warzynski (2012) Sketch of main idea I

I Definition of markup: µ = P/MC

v I Let Pit represent the price of input v and let Pit represent the price of output.

I Production function:

 1 V  Qit = Qit Xit ,..., Xit , Kit , ωit

where v = 1, 2,..., V indexes variable inputs.

I Assumption: variable inputs are set each period to minimize costs.

74 / 84 De Loecker and Warzynski (2012) Sketch of main idea II

I Lagrangian for cost minimization problem:

V  1 V  X v v L Xit ,..., Xit , Kit , λit = Pit Xit + rit Kit + λit (Qit − Qit (·)) v=1

I First-order condition:

v ∂Qit (·) Pit − λit v = 0, ∂Xit

where λit is the marginal cost of production at production level Qit .

75 / 84 De Loecker and Warzynski (2012) Sketch of main idea III

I First-order condition:

v ∂Qit (·) Pit − λit v = 0. ∂Xit

v I Multiplying by Xit /Qit :

v v v ∂Qit (·) Xit 1 Pit Xit v = . ∂Xit Qit λ Qit

I With µit ≡ Pit /λit ,

v v v ∂Qit (·) Xit Pit Xit v = µit ∂Xit Qit Pit Qit

where we have multiplied and divided by Pit on the RHS.

76 / 84 De Loecker and Warzynski (2012) The markup formula

This leads to a simple expression:

v v −1 µit = θit (αit )

v v where θit is the output with respect to input v, and αit is expenditures on input v as a share of revenues.

I On its own, this formula is nothing new

v I What’s new about DLW is how flexible they are about estimating θit and how they base their inferences about markups on careful production function estimation.

77 / 84 De Loecker and Warzynski (2012) The demand-based approach

I Recall the formula for monopoly pricing: p 1 = −1 mc 1 + ED

−1 where ED is the inverse elasticity of demand. I In more complicated settings (e.g., differentiated products), we can still solve for markups as a function of demand elasticities.

I Demand-based approach has been the standard, but notice the many assumptions involved: I Typically static Nash-Bertrand competition (or at least some imperfect competition game where we can easily solve for the equilibrium) I Instruments to identify demand I Functional form assumptions on demand system, model of consumer heterogeneity

78 / 84 De Loecker and Warzynski (2012) CD: example

I Assume labor is a flexible input.

I With Cobb-Douglas production function,

βL βK Qit = exp (ωit ) L K ,

output elasticity of labor is just a constant:

L ∂Qit Lit θit = = βL. ∂Lit Qit

I Markup: βL µit = L αit

79 / 84 De Loecker and Warzynski (2012) CD: concerns

Cobb-Douglas markup: βL µit = L αit Some things we might worry about:

I Bias in estimating βL without appropriate econometric strategy (always a concern in production function estimation)

I Cobb-Douglas is very restrictive, imposing output elasticity which does not depend on Q nor the relative levels of inputs. Variation in expenditure shares will be only source of variation in markups.

I If we assume variation of input share is independent of output elasticity, then any variation in productivity which affects the input share is being treated as variation in markups.

80 / 84 De Loecker and Warzynski (2012) Translog production function

I DLW’s main results are based on a translog production function:

2 2 yit = βl lit + βk kit + βll lit + βkk kit + βlk lit kit + ωit + εit .

I Translog output elasticities: ˆL ˆ ˆ ˆ θit = βl + 2βll lit + βlk kit ,

so translog production is flexible enough to allow for a first-order approximation to how output elasticities vary with input use.

81 / 84 De Loecker and Warzynski (2012) Empirical framework

I Consistent with production function estimation literature, they assume Hicks-neutral productivity shocks:

 1 V  Qit = F Xit ,..., Xit , Kit ; β exp (ωit ) .

I Also allow for some measurement error in production:

yit = ln Qit + εit

yit = f (xit , kit ; β) + ωit + εit

82 / 84 De Loecker and Warzynski (2012) The control function

I Following Levinsohn and Petrin, use materials to proxy for productivity

mit = mt (kit , ωit , zit )

where zit are controls.

I Note: a big claim of the paper is estimating "markups without specifying how firms compete in the product market"

I But here, zit must control for everything which shifts input demand choices or else there will be variation in productivity they’re not controlling for (and hence some of the variation in their inferred markups may actually come from variation in productivity)

I In the appendix, they explain that zit includes input prices, lagged inputs (meant to capture variation in input prices), and exporter status.

83 / 84 De Loecker and Warzynski (2012) Physical output vs. sales

I Note that the theory is developed in terms of outputs, but DLW only have sales (as usual).

I For a price-taking firm, there’s no problem rewriting the formula in terms of sales: v v v v ∂Rit (·) Xit Pt ∂Qit (·) Xit Pit Xit v = v = µit ∂Xit Rit ∂Xit Pt Qit Pit Qit

∂Rit (·) Pt ∂Qit (·) because v = v . ∂Xit ∂Xit I However, if the firm has market power,   ∂Rit (·) ∂Qit (·) ∂Pit v = v Pit + . ∂Xit ∂Xit ∂Qit

84 / 84