Pervasive Stickiness (Expanded Version)

N. Gregory Mankiw and Ricardo Reis∗ Harvard University and Princeton University

January 2006

Abstract

This paper explores a macroeconomic model of the business cycle in which stickiness of information is pervasive. We start from a familiar benchmark classical model and add to it the assumption that there is sticky information on the part of consumers, workers, and firms. We evaluate the model against three key facts that describe short- run fluctuations: the acceleration phenomenon, the smoothness of real wages, and the gradual response of real variables to shocks. We find that pervasive stickiness is required to fit the facts. We conclude that models based on stickiness of information offer the promise of fitting the facts on business cycles while adding only one new plausible ingredient to the classical benchmark.

JEL codes: E30, E10

∗This is an extended version of our paper with the same title published in the American Economic Review, May 2006. It includes a lengthy appendix laying out the model and explaining the algorithm that solves it.

1 This paper explores a macroeconomic model of the business cycle in which stickiness of information is a pervasive feature of the environment. Prices, wages, and consumption are all assumed to be set, to some degree, based on outdated information sets. We show that a model with such pervasive stickiness is better at matching some key facts that describe economic fluctuations than is either a benchmark classical model without such informational frictions or a model with only a subset of these frictions. The benchmark classical model that provides the starting point for this exercise will seem familiar to most readers. Prices are based on marginal cost; wages are based on the marginal rate of substitution between work and leisure; the demand for output is derived from a forward-looking consumption Euler equation; and interest rates are set by the central bank according to a conventional Taylor rule. The economy is buffeted by two kinds of disturbances: shocks to the production function and shocks to monetary policy. To this benchmark model, we add the assumption of sticky information. In Mankiw and Reis (2002) and Reis (forthcoming) we showed that if firms are assumed to set prices based on outdated information sets, certain features of inflation dynamics are more easily explained. In Mankiw and Reis (2003) we found that sticky information on the part of workers could account for some features of the labor market. And Reis (2004) discovered that inattentiveness on the part of consumers helps explain the dynamics of consumption.1 Here we show that pervasive stickiness of this type can simultaneously help explain several features of business-cycle dynamics.

I. Three Key Facts

We focus here on three key facts that describe short-run economic ﬂuctuations. These facts are chosen because we believe they are crucial for any business cycle theory to explain and because they are hard to square with macroeconomic models without any frictions.

Fact 1: The Acceleration Phenomenon. In Mankiw and Reis (2002), we emphasized that inflation tends to rise when the economy is booming and fall when economic activity is depressed. This is the central insight of the empirical literature on the Phillips curve. One simple way to illustrate this fact is to correlate the change in inflation, πt+2 πt 2 with − − 2 output yt detrended with the HP filter. In U.S. quarterly data from 1954:3 to 2005:3, the

1 Xavier Gabaix and David Laibson (2002) and Jonathan A. Parker and Christian Julliard (2005) explored the consequences of inattentiveness for the link between consumption and asset prices. 2 All variables in this paper are in logs and are for the non-farm business sector. Inﬂation is measured

2 correlation is 0.47. That is, the change in inﬂation is procyclical.

Fact 2: The Smoothness of Real Wages. According to the classical theory of the labor market, the real wage equals the marginal product of labor, which, under Cobb-Douglas production, is proportional to the average productivity of labor. In the data, however, real wages do not ﬂuctuate as much as labor productivity. In particular, the standard deviation of the quarterly change in real compensation per hour is only 0.69 of the standard deviation of the change in output per hour. The real wage appears smooth relative to its fundamental determinant.

Fact 3: Gradual Response of Real Variables. Empirical estimates of the dynamic response of economic activity to shocks typically show a hump-shaped response. The full impact of shocks is usually felt only after several quarters. One simple way to demonstrate this fact is to compare the standard deviation of the quarterly change in output, σ(yt yt 1), − − 1 with one-half the standard deviation of the four-quarter change in output, σ(yt yt 4). 2 − − For a random walk, there is no hump-shaped response, and these two measures are equal. In U.S. data, however, the ﬁrst is only 0.79 of the second, indicating that the impact of shocks builds over several quarters.

In summary, here are the three facts we focus on:

trend 1. ρ(πt+2 πt 2,yt yt )=0.47; − − − 2. σ [∆(w p)] /σ [∆(y l)] = 0.69; − − 1 3. σ(yt yt 1)/[ σ(yt yt 4)] = 0.79. − − 2 − − As we will see, a benchmark classical model has trouble ﬁtting each of these facts. We can ﬁx this problem with the assumption of pervasive stickiness of information.

II. The model

Markets and individual behavior. We will use a standard general equilibrium new Key- nesian model with monopolistic competition and no capital accumulation. Because the model is standard, we briefly sketch it here, relegating a detailed exposition to the appendix. There are three types of agents in the economy: firms, consumers, and workers. They meet in markets for labor, goods, and savings. by the change in the log of the implicit price deflatorforthissector,whichisalsousedtocreateallreal variables.

3 The firms in the model have a monopoly over a specific product, for which the demand has a constant price elasticity ν.Eachfirm operates a technology yt,j = at + βnt,j that transforms a composite variable labor input (nt,j) into output (yt,j) under decreasing returns 3 to scale (β (0, 1)) subject to aggregate productivity shocks (at). Productivity follows a ∈ random walk with a standard deviation of innovations of σa. The composite input combines different varieties of labor supplied through a Dixit-Stiglitz aggregator with an elasticity of substitution γ. Within each firm, there are two decision-makers. Thehiringdepartmentisinchargeof purchasing the different varieties of labor so as to minimize costs. The sales department produces the good and sets its price to maximize profits. Although the hiring department acts with perfect information, the sales department faces costs of acquiring, absorbing, and processing information as in Reis (forthcoming), so it only sporadically updates its information. A firm that last updated its information j periods ago, up to a first-order approximation, sets a price:

β(wt pt)+(1 β)yt at pt,j = Et j pt + − − − . − β + ν(1 β) ∙ − ¸

The ﬁrm wishes to set its price (pt,j) relative to the aggregate of prices set by other ﬁrms

(pt) to increase with real marginal costs. Real marginal costs are higher if the real wage

(wt pt) is higher, if production (yt) is larger because of diminishing returns to scale, and − if productivity (at)islower. As in Mankiw and Reis (2002), price setters have sticky information. In each period, a fraction λ of ﬁrms, randomly drawn from the population, obtains new information and recalculates the optimal price.4 The price level, up to a ﬁrst-order approximation, then equals: ∞ j pt = λ (1 λ) pt,j. − j=0 X Consumers are the second set of agents. They maximize expected discounted utility

3 β 1 β You can alternatively think of firmsasoperatingatechnologyYt,j = AtNt,j Kt,j− where Kt,j is a fixed endowment of capital. Since we are abstracting from capital accumulation, this is equivalent to our model with the fixed amount of firm capital normalized to one. 4 Reis (forthcoming) provides a micro-foundation for why firms would choose plans for prices and of the conditions under which, in a population of firms that optimally choose to be inattentive, the arrival of planning dates has an exponential distribution.

4 from consuming every period a Dixit-Stiglitz aggregator of the diﬀerent varieties of goods that the ﬁrms sell. They face an intertemporal budget constraint. The nominal interest rate is it, the real interest rate is rt, and the Fisher equation holds:

rt + Et(∆pt+1)=it.

Consumers also have two decision-makers. One is a shopper who allocates total expenditures over the diﬀerent varieties using full information. This leads to the constant price-elasticity demand for the product of each ﬁrm mentioned earlier. The other decision- maker is a planner who allocates total expenditure over time. She faces costs of information, leading her to stay inattentive; every period a fraction of consumers δ updates their information. Reis (2004) provides a detailed analysis and micro-foundation for this behavior.

A planner that last updated her information j periods earlier chooses expenditure ct,j to satisfy the log-linearized Euler equation:

ct,j = θEt j (rt)+δEt j (ct+1,0)+(1 δ)ct+1,j+1. − − − −

The parameter θ is the elasticity of intertemporal substitution. The consumers diﬀer only with regards to when they last updated their plans. Total consumption, up to a ﬁrst-order approximation, is therefore equal to:

∞ j yt = δ (1 δ) ct,j, − j=0 X where we used market clearing to replace total consumption with aggregate output. Workers are the ﬁnal set of agents. They share a household with consumers and so also care about maximizing expected discounted utility subject to the same intertemporal budget constraint. They choose how much to work and what wage to charge for the particular variety of labor over which they hold a monopoly. The demand for their services comes from the hiring department of ﬁrms and therefore has a constant price elasticity of γ. A worker who last updated her information j periods ago sets a nominal wage according to the Euler equation:

ψwt,j = Et j lt,j ψrt + ψpt + ω [ψ(wt+1,0 pt+1) lt+1,0]+(1 ω)[ψ(wt+1,j+1 pt+1) lt+1,j+1] . − { − − − − − − }

5 The parameter ψ measures the Frisch wage elasticity of labor supply, while ω is the probability that any worker faces of updating her plans at any date. The nominal wage (wt,j)is higher the more labor is supplied (lt,j) and the higher are prices pt. As in Robert E. Lucas Jr. and Leonard A. Rapping (1969), workers intertemporally substitute labor. The higher they expect their wage to be tomorrow, the more willing they are to work then rather than now and so the higher the wage that they demand today. Likewise, if they expect to work more tomorrow, they wish to substitute part of this into work today and thus lower their wage demands. The last component of the intertemporal labor supply is the real interest rate. The higher is rt, the higher are the returns to working today rather than tomorrow. This leads to an increase in the willingness to work today and thus lowers wage demands.5 The wage index equals, up a ﬁrst-order approximation:

∞ j wt = ω (1 ω) wt,j. − j=0 X Finally, the monetary authority follows a Taylor rule:

n it = φ (yt y )+φ ∆pt + εt. y − t π

The parameter φπ is larger than one, respecting the Taylor principle and ensuring a determi- n nate equilibrium for inﬂation. The natural level of output yt denotes the equilibrium level of output if all agents were attentive (that is, if λ = δ = ω =1) so policy responds to the 6 output gap. Finally, εt denotes policy disturbances which follow a ﬁrst-order autoregressive 7 process with parameter ρ and standard deviation of shocks σe.

The reduced form of the model. From the previous equations, one can obtain three equations that capture the equilibrium in the three markets of the model. The ﬁrst equation

5 If both members of a household update their information at the same time, then labor supply has the perhaps more familiar static form: ψwj,t = Et j (lj,t + ψct,j ). However, if workers set their wage plans at − different dates from when consumers set their consumption plans, this condition does not hold. The two members of the household do not agree on the marginal value of an extra unit of wealth. 6 n One can to show that yt =(1+1/ψ)at/(1 + 1/ψ + β/θ β). 7 Our choices regarding inattentiveness were made in an attempt− to avoid some thorny theoretical issues. For example, if shoppers were inattentive, monopolistic firms would be tempted to raise prices to take advantage of their inattentiveness. Separating consumers and firms into attentive and inattentive pieces allows us to make prices, wages, and consumption sticky at the macroeconomic level without inducing such strategic responses at the microeconomic level.

6 is an AS relation or Phillips curve:

∞ j β(wt pt)+(1 β)yt at pt = λ (1 λ) Et j pt + − − − . − − β + ν(1 β) j=0 X ∙ − ¸ Intuitively, the higher are expected prices or marginal costs, the higher will be the price that firms wish to set. In response to an unexpected rise to these variables though, only a share λ of firms will raise their price. The second condition is an IS equation capturing the relationship between spending and financial conditions: ∞ j n yt = δ (1 δ) Et j (yt θRt) − − − j=0 X 8 Rt = Et ( i∞=0 rt+i), the long real interest rate. Higher expected productivity increases spending,P while higher expected interest rates lower spending by encouraging saving. The stickier is information (smaller δ), the smaller is the impact of shocks on spending since fewer consumers are aware of them. The third equation is a labor market clearing equation or wage curve:

n ∞ j γ(wt pt) yt at ψ (yt θRt) wt = ω (1 ω) Et j pt + − + − + − . − − γ + ψ β(γ + ψ) θ(γ + ψ) j=0 X ∙ ¸ Nominal wages increase one-to-one with expected prices because workers care about real not nominal wages. The more labor is used in production, the higher are wages, reﬂecting the standard slope of the labor supply curve. Higher expected productivity leads to higher wages. Finally, higher interest rates imply a larger return on today’s saved earnings thus leading to more willingness to work and lower wage demands. These three equations combined with the Fisher equation and the Taylor rule determine a sticky information equilibrium in (yt,pt,wt,rt,it) given exogenous shocks to (at,εt).The appendix describes an algorithm that computes the equilibrium. We will use a baseline set of parameters. For preferences: θ =1so utility over consumption is logarithmic, ψ =4so labor supply is very wage elastic, and ν =20so the price markup is about 5% consistent with the lower end of the estimates in Susanto Basu and John G. Fernald (1995). For technology, we assume that γ =10so the wage markup is about 11% and that the labor

8 All variables are in deviations from the steady state so limi Et[rt+i]=0and the long rate is ﬁnite. →∞ See the appendix for more details.

7 share of income β =2/3. The Taylor rule parameters are taken from Glenn D. Rudebusch

(2002): φy =0.33, φπ =1.24, ρ =0.92 and σe =0.0036. Finally, based on U.S. quarterly data, we set σa =0.0085. We have experimented with alternative reasonable parameter values and obtained similar conclusions, but we do not report these experiments here due to space constraints.

III. The Need for Pervasive Stickiness

The classical benchmark. We start with the classical model in which there is no stickiness of information. In this fully attentive economy, the classical dichotomy holds, and output is always at its natural level. Because there is no output gap, the model offers no obvious way of explaining fact 1, the acceleration phenomenon. In this classical benchmark, output (which is driven solely by productivity shocks) and inflation(whichisdrivensolelyby monetary policy shocks) are independent. The model also cannot explain fact 2, the smoothness of real wages: without any rigidities, real wage growth exactly equals productivity growth. Finally, output is proportional to productivity (see footnote 6). Thus it follows a random walk, contradicting fact 3. We therefore conclude that this frictionless economy cannot fit any of the three facts.

Single sources of stickiness. Imagine now that only firms are inattentive, updating their information on average once a year (λ =1/4). The model can now generate an acceleration correlation of 0.56, moving in the direction of fitting fact 1. But σ [∆(w p)] /σ [∆(y l)] = − − 1 1.54 and σ(yt yt 1)/[ σ(yt yt 4)] = 1.03, so the model moves in the wrong direction − − 2 − − when it comes to fitting the other two facts. Alternatively, suppose there is sticky information only in the labor market with 25% of workers updating their plans every period (ω =0.25). The model again moves in the right direction with regards to the acceleration phenomenon, predicting a correlation between changes in inflation and the output gap of 0.10. However, real wages are exactly as volatile as labor productivity, and output adjusts quickly to shocks (the ratio of standard deviations is 1.17). The result concerning real wages can be derived from the Phillips curve: if goods prices are set with full attention, real wages always equal output per hour. The last case is that of only inattentive consumers (δ =0.25). This model fails to match fact 1 (the correlation between inflation and the output gap is almost exactly zero) and fact 2 (real wages are just as volatile as labor productivity). Sticky information on the part of

8 consumers helps move the model closer to the data with regards to the sluggishness of real 1 variables. The ratio σ(yt yt 1)/[ σ(yt yt 4)] is 0.65, much closer to fact 3 on U.S. data. − − 2 − − Two sources of stickiness. What if two of the three sets of agents in the economy are inattentive, but the remaining are attentive? Again, the model cannot ﬁtthefacts.If producers are attentive, then real wages and output per hour are proportional, failing to match fact 2 concerning the smoothness of real wages. If instead workers are the only agents without sticky information, then σ [∆(w p)] /σ [∆(y l)] = 1.68. In this case, real wages − − are more volatile than productivity, again failing to match fact 2. Finally, if consumers 1 are the only attentive agents, then σ(yt yt 1)/[ σ(yt yt 4)] = 1.03. The model with − − 2 − − attentive consumers cannot generate fact 3, the gradual response of real output.

Pervasive stickiness. The previous cases showed that with either no stickiness or selec- tive stickiness, one cannot fit all three business cycle facts. Pervasive stickiness is necessary. We now ask whether pervasive stickiness is itself enough to account for the facts. We start with the case where firms, consumers, and workers, are all inattentive with λ = δ = ω = n 0.25. In this economy, ρ(πt+2 πt 2,yt yt )=0.63, σ(∆(w p))/σ(∆(y l)) = 0.29 and − − − − − 1 σ(yt yt 1)/[ σ(yt yt 4)] = 0.69. Pervasive stickiness moves the baseline classical model − − 2 − − in the right direction across all three dimensions. Changes in inflation are now positively correlated with real activity, wages are smoother than productivity, and output adjusts gradually to shocks. These results come from somewhat arbitrarily setting the degree of information stickiness to 0.25 for all sectors of the economy. We have searched for the values of the inattentiveness parameters λ, ω, and δ that move the model closest to fitting the three facts, in the sense of minimizing the sum of squared deviations of the model’s predicted moments and their empirical counterparts. Formally, this is akin to the method of simulated moments with a GMM weighting matrix that gives each moment the same weight. The resulting estimates are λ =0.52,ω=0.66, and δ =0.36. In this best-fitting case, firms setting prices update their information on average about every 6 months, workers setting wages update about every 4.5 months, and consumers update about every 9 months. Despite this mild amount of inattentiveness and the model’s simplicity, it fits the facts remarkably well: its predicted moments are within less than 0.06 of the three facts. Using the same estimation method but assuming all agents update their plans with the same frequency leads to an estimated

9 probability of adjustment of 0.57, indicating that agents update their information on average n every 5 months. In this case, ρ(πt+2 πt 2,yt yt )=0.43, σ [∆(w p)] /σ [∆(y l)] = 0.56 − − − − − 1 and σ(yt yt 1)/[ σ(yt yt 4)] = 0.89. Introducing this one free parameter moves the model − − 2 − − signiﬁcantly in the direction of explaining all three facts

IV. Conclusion

Many modern models of business cycles start from a classical benchmark similar to the one in this paper. Over the past two decades, however, researchers have found that this model has several shortcomings and have proposed remedies. Because monetary policy seems to have real effects, research has recently focused on a hybrid formulation of Calvo’s sticky price model in which either some price-setters are naive or all index their prices to past inflation. Because real wages are smooth in the data, research has looked into models with adjustment costs in using inputs, norms in labor bargaining, or direct real wage rigidities. Because consumption and output growth are positively serially correlated, research has considered modelling representative agents that form habits. In a prescient article, Christopher A. Sims (1998) noted that across all dimensions, to match the data, the classical model needed “stickiness.” It has become increasingly clear that stickiness is not just needed but must also be pervasive. Fixing the classical model with a series of isolated patches, however, runs the risk of losing the discipline of having a model altogether. Inattentiveness and stickiness of information have the virtue of adding only one new plausible ingredient to the classical benchmark. The results reported here suggest that such a model moves promisingly in the direction of fitting the facts on business cycles.

10 Appendix

This appendix contains a description of the model used in the paper and the algorithm that solves it.

A.I. The economic environment

Households. There is a continuum of households distributed in the unit interval and indexed by j. They live forever discounting future utility by a factor ξ (0, 1) and obtaining ∈ utilityeachperiodaccordingto:

1 1/θ 1+1/ψ Ct,j− 1 κLt,j U(Ct,j,Lt,j)= − , (1) 1 1/θ − 1+1/ψ − where: θ is the intertemporal elasticity of substitution, ψ is the Frisch elasticity of labor supply, κ captures relative preferences for consuming goods or leisure, Ct,j is the consumption by household j at date t,andLt,j is the labor supplied by household j at date t.

Consumption Ct,j is a Dixit-Stiglitz aggregator of the consumption of varieties indexed by i, Ct,j(i), with an elasticity of substitution ν:

ν 1 − 1 ν ν ν 1 Ct,j = Ct,j(i) − di (2) µZ0 ¶ At each date t, the household faces a budget constraint:

PtCt,j + Bt,j = Wt,jLt,j +(1+it 1)Bt 1,j + Tt,j. (3) − −

The new notation stands for: Pt is the dollar price of goods at date t, Bt,j are holdings of nominal bonds, Wt,j is the nominal wage paid to household j,andit 1 is the nominal − net return at t on a bond purchased at t 1.FinallyTt,j are lump-sum nominal transfers − received by the the household from two sources. First, they come from receiving profits from firms, which are equally owned by all households. Second, we assume that consumers signed an insurance contract at the beginning of time so that they all start each period with with the same wealth. This way, we do do not have to track the wealth distribution. The payments from this contract are in Tt,j. The Dixit-Stiglitz aggregator has an associated static price index: 1 1 1 ν 1 ν − Pt = Pt(i) − di . (4) µZ0 ¶ 11 Technologies. Households own a continuum of firms in the unit interval indexed by j.

Firm j operates a technology that combines the labor supplied by each household i, Nt,j(i), into the output of a particular variety of good Yt,j. The production function is:

β Yt,j = AtNt,j, (5) γ 1 − 1 γ γ γ 1 Nt,j = Nt,j(i) − di . (6) µZ0 ¶

At stands for exogenous aggregate productivity, which follows a random walk in logs with standard deviation of shocks σa. The parameter β (0, 1) is the labor share of income ∈ and measures the degree of diminishing returns to scale. The composite of inputs used by

ﬁrm j, Nt,j, is a Dixit-Stiglitz aggregator of the diﬀerent varieties of labor hired with an elasticity of substitution γ. It implies a dual minimum-expenditure static wage index:

1 1 1 γ 1 γ − Wt = Wt(i) − di . (7) µZ0 ¶ Markets. In ﬁnancial assets, there is an anonymous market for nominal bonds. They are in zero net supply so for the market to clear:

1 Bt,jdj =0, (8) Z0 There is a goods market for each variety i, in which all consumers are buyers and the sole seller is ﬁrm i that has a monopoly over its variety. Market clearing requires:

1 Ct,j(i)dj = Yt,i. (9) Z0 Across all varieties, total output is:

ν 1 1 ν −ν ν 1 Yt = Yt,i− di . (10) µZ0 ¶ Finally, there is a labor market for each variety of labor i. The buyers are all ﬁrms and the seller is the household that has the monopoly over labor services i. In equilibrium:

1 Nt,j(i)dj = Lt,i. (11) Z0

12 Total labor is: γ 1 1 γ −γ γ 1 Lt = Lt,i− di . (12) µZ0 ¶ Decision-makers and information. Consumers wish to maximize the expected discounted sum of utility at each date (1) given the preferences in (2) subject to the sequence of budget constraints (3) from t into inﬁnity and a no-Ponzi scheme condition. There are two decision-makers within a consumer, who cannot exchange information. One is a shopper whose job is to pick at each date the consumption of each variety taking total expenditure at that date as given. The shopper has full information and searches the markets for all varieties for the best bundle of goods at lower cost. The other decision-maker is a planner, whose job is to choose the total amount of expenditure at each date and how much to save. The planners are inattentive only sporadically updating their information. When a planner updates her information, she obtains full information, but in between updates she obtains no new information. Reis (2004) presents a model in which costs of acquiring, absorbing, and processing information lead planners to optimally choose how often to update their information. Here, we take this behavior as given. Moreover, following Mankiw and Reis (2002), we assume that there is sticky information, understood as a constant probability δ at each date that any planner receives new information. Our assumptions imply that planners diﬀer only with regards to their information. To continue using j to index planners, we change its meaning. From now on, j denotes how long ago a planner last updated her information so Ct,j is the expenditure of a planner who last updated her information j periods ago. Thus, the unit mass of planners is divided into a countable number of groups of consumers, each with mass δ(1 δ)j. − A second set of agents includes workers. They have the same objective and face the same constraints as consumers–they share a household. Their choice is what wage to charge for their labor services. They post a wage monopolistically taking into account the demand for their labor and commit to supplying the labor necessary to ensure that the market clears at that wage. They are also inattentive with sticky information, as in Mankiw and Reis (2003).

The wage Wt,j is set by a worker that last updated her information j periods ago and sticky information implies that at every date a fraction ω of workers update their information. The final set of private agents are firms. Within the firm, there are two departments making decisions. The hiring department takes as given the choice of how much to produce

13 and hires the combination of labor inputs that minimizes costs using full information. The sales department sets a price that takes into account its monopoly power and the demand for its product and commits to producing as much of the good as necessary to clear the market. They are inattentive as modelled explicitly in Reis (2005), who provides also a set of conditions under which information is sticky as in Mankiw and Reis (2002). Each date, a fraction λ of sales departments in ﬁrms obtain new information.

Monetary Policy. In this cashless economy, we assume that the government can enforce the use of a unit of account and issue nominal bonds. This gives it the power to set the nominal interest rate. (See Woodford, 2003, for an exposition of how interest rates are set in cashless economies.) We assume that policy mechanically follows a Taylor rule:

Yt Pt it = φy log n + φπ log + εt. (13) Yt Pt 1 µ ¶ µ − ¶ n Yt will be defined later and εt are policy shocks that follow the process: εt = ρεt 1 + et − where et is white noise with mean zero and standard deviation σe. The parameter φπ is larger than one, respecting the Taylor principle, while the parameter φy is non-negative reflecting a desire for stabilization. Finally, note that the interest rate rule does not ensure determinacy of the price level. This indeterminacy is well-known and there are many slight modifications of the model that eliminate it (see, for instance, Woodford, 2003, chapter 2). We do not wish to complicate the model further by addressing this issue directly. Instead, we peg the initial price level at an initial condition: P 1 =1. − A.II. Equilibrium of the economy

To solve the model, we must ﬁrst describe optimal behavior. We start with consumers and their two choices. Optimal behavior by shoppers implies that the demand for each variety by consumer j: ν Ct,j(i)=Ct,j (Pt(i)/Pt)− . (14)

Summing over all consumers and using the market clearing condition for variety i in (9) implies: 1 ν Yt,i =(Pt(i)/Pt)− Ct,jdj . (15) µZ0 ¶ Moving next to planners, recall that they obtainnewinformationwithprobabilityδ every period. Recall also that all planners are identical aside from how long they last

14 planned j. Letting At,j [(1 + it 1)Bt 1,j + Wt,jLt,j + Tt,j]/Pt denote the real resources ≡ − − with which planner j enters period t, the assumption of perfect insurance implies that

At,j = At, the same for all planners. We denote by V (At,.) the value function for planners that plan at period t. The second argument in the value function includes other state variables that may be useful at forecasting the future–we will omit it from now onwards. The planner solves the dynamic program:

∞ i i ∞ i i V (At)= max ξ (1 δ) U(Ct+i,i,.)+ξδ ξ (1 δ) Et [V (At+1+i)] , (16) Ct+i,i − − (i=0 i=0 ) { } X X Wt+1+i,.Lt+1+i,. + Tt+1+i,. s.t. : At+1+i = Rt+1+i (At+i Ct+i,.)+ . (17) − Pt+1

The ﬁrst term in the Bellman equation equals the expected discounted utility if the planner never updates her information again. The second term includes the sum of the continuation values over all of the possible future dates at which the agent may plan again, each occurring with a probability δ(1 δ)i. The constraint comes from re-writing of the budget constraint, − where

Rt+1 (1 + it) Pt/Pt+1, (18) ≡ the real return on bonds. Note that the consumer takes work choices as given. The ﬁrst-order conditions for optimality are:

i i 1/θ ∞ k k ξ (1 δ) C− = ξδ ξ (1 δ) Et V 0 (At+1+k) R¯t+i,t+1+k , (19) − t+i,i − k=i X £ ¤

t+k for all i =0, 1,...,whereR¯t+i,t+1+k = Rz+1, the compound return between two dates. z=t+i The envelope theorem condition is: Q

∞ k k V 0 (At)=ξδ ξ (1 δ) Et V 0 (At+1+k) R¯t,t+1+k . (20) − k=0 X £ ¤ 1/θ Combining condition (19) at i =0and (20) shows that V 0 (At)=Ct,−0 .Themarginal value of an extra unit of resources equals the marginal utility of using them immediately for consumption. Using this result to replace for the marginal value terms in the optimality

15 conditions, they simplify to:

1/θ 1/θ Ct,−0 = ξEt Rt+1Ct−+1,0 , (21) 1/θ h 1/θ i Ct−+j,j = Et Ct−+j,0 , (22) h i holding for all t and all j.Thefirst condition is the usual Euler equation for an attentive consumer. The second condition shows that an inattentive consumer sets the marginal utility of her consumption equal to her expectation of the marginal utility of the attentive consumer. Moving next to firms, the hiring department minimizes costs with full information. The optimal demand by firm j for labor services of variety i is:

γ Nt,j(i)=Nt,j (Wt(i)/Wt)− . (23)

Summing over all ﬁrms and using the market clearing condition in labor variety i, we obtain:

1 γ Lt(i)=(Wt(i)/Wt)− Nt,jdj . (24) µZ0 ¶ The sales department maximizes profits subject to the production function (5) and the iso-elastic demand for its product in (15). For a firm that last updated its information j periods ago, the first-order condition is:

1/β 1/β Et j WtYt,j At− ν − Pt,j = . (25) ν 1 × E³ t j (βYt,j) ´ − − This states the usual result that with iso-elastic preferences, nominal prices are a ﬁxed markup over nominal marginal costs. The markup equals ν/(ν 1). The other fraction in − the expression, for a ﬁrm that is planning, equals the nominal marginal cost–the nominal wage divided by the marginal product of the composite labor input. Finally, we move to workers. Their problem is similar to that faced by the consumption

16 planners. Jumping to the optimality conditions:

1/ψ γ PtκLt,j Wt,0 = , (26) γ 1 × Vˆ (.) − t0 1/ψ 1/ψ Lt,0 Pt ξRt+1Lt+1,0Pt+1 = Et , (27) Wt,0 Ã Wt+1,0 ! 1+1/ψ Et Lt+j,j W = . (28) t+j,j 1/ψ³ ´ Et Lt+j,0Lt+j,j/Wt+j,0 ³ ´ The first condition states that wages equal a constant markup γ/(γ 1) over the marginal − opportunity cost of labor. For the worker that is planning, this equals the marginal disutility 1/ψ ˆ of labor κLt,j divided by the marginal utility of an extra dollar for the worker Vt0.The second equation is a standard Euler equation for an attentive worker. Supplying an extra 1/ψ unit of labor today leads to a fall in utility of Lt,0 . In return, the worker receives Wt,0/Pt, which after invested in bonds returns Rt+1 per unit the next period. The worker can then work less (Rt+1Wt,0/Pt)Pt+1/Wt+1,0 units tomorrow which raise utility by this amount times 1/ψ the marginal utility of labor tomorrow Lt+1,0 discounted by the factor ξ. At an optimum, the cost of anticipating work must equal its expected benefit, so the equality must hold. Finally, the third condition states that an inattentive worker sets wages so that her expected disutility from working mirrors the expected disutility from working of an attentive worker. We can now define a competitive equilibrium of this economy: it is an allocation of total expenditures and savings, consumption of varieties, labor supplied of the different varieties, and output produced of each variety such that consumers, workers and firms all behave optimally, monetary policy follows the Taylor rule, and all markets clear. Theequationsabovecharacterizethis equilibrium. However, they are difficult to handle.

We proceed by log-linearizing around the stationary point where σa = σe =0so all variables are constant. Small letters denote the log-linear deviation of the respective capital variables from the steady state, with the exception of rt, which denotes the log-linear deviation of

17 Et[Rt+1]. The set of log-linearized optimality conditions is:

rt = it Et (∆pt+1) , (29) − yt,j = yt ν (pt,j pt) , (30) − − lt,j = lt γ(wt,j wt), (31) − − ct,j = Et j (ct+1,0 θrt) , (32) − − pt,j = Et j [wt +(1/β 1)yt,j at/β] , (33) − − − wt,j = Et j[pt + lt,j/ψ rt + wt+1,0 pt+1 lt+1,0/ψ]. (34) − − − −

The log-linearized deﬁnitions of the aggregate production function and the price, wage, and output indices are:

yt = at + βlt (35)

∞ j pt = λ (1 λ) pt,j, (36) − j=0 X ∞ j wt = ω (1 ω) wt,j (37) − j=0 X ∞ j yt = δ (1 δ) ct,j, (38) − j=0 X Finally the log-linear Taylor rule is:

n it = φ (yt y )+φ ∆pt + εt. (39) y − t π This set of 11 equations over time provide the competitive equilibrium solution for the set of 11 variables (yt,yt,j,ct,j,lt,j,lt,wt,wt,j,pt,pt,j,it,rt) as a function of the exogenous n processes at, εt and yt .

A.III. Reduced-form representations of the model

Thefullyattentiveeconomy. In the classical economy, λ = δ = ω =1so all are attentive. Following convention, we label the equilibrium in this economy “natural.” Equations (36)- n n n n n n n n (38) imply that pt,j = pt , wt,j = wt and ct,j = yt , while (30)-(31) imply that lt,j = lt n n and yt,j = yt .Thisreﬂects the fact that all agents are identical. A few steps of algebra n n n n n n n using equations (32)-(35) shows that: l =(y at)/β, w p = y l , r =0and t t − t − t t − t t

18 n that y = Ξat,whereΞ =(1+1/ψ)/(1 + 1/ψ + β/θ β). Note that all real variables t − are determined as a function of only the exogenous technology shock at, independently of monetary policy. The classical dichotomy holds in this economy. Monetary policy shocks determine the nominal interest rate and inﬂation through the Taylor rule and the Fisher n n n n equation. The solutions are: ∆p = εt/(φ ρ) and i = r + Et ∆p . t − π − t t t+1 We have solved for the natural equilibrium. Note that for preferences¡ consistent¢ with a balanced growth path (θ =1) the parameter Ξ =1. Therefore, output and real wages are proportional to productivity and labor supplied is constant. Since output is always at its natural level that follows a stochastic trend, there is no output gap and so the acceleration correlation is zero. Moreover, as at and εt are independent, output and inﬂation in the attentive economy are statistically independent. Since real wages are proportional to output per hour, the two are equally variable. And since output follows a random walk, the standard deviation of its quarterly changes equals one-half of the standard deviation of its annual changes.

The reduced-form sticky information economy. Combining equations (30) and (33) to replace for pt,j and yt,j in equation (36) gives the aggregate supply relation:

∞ j β(wt pt)+(1 β)yt at pt = λ (1 λ) Et j pt + − − − . (40) − − β + ν(1 β) j=0 X ∙ − ¸

Denoting by mct (real marginal costs) the fraction on the right-hand side, we can re-arrange this equation to obtain a sticky-information Phillips curve:

λmct ∞ j ∆pt = + λ (1 λ) Et 1 j (∆pt + ∆mct) . (41) 1 λ − − − j=0 − X Iterating forward on equation (32), we get:

T ct,j = θ Et j (rt+i)+Et j (ct+T +1,0) . (42) − − − i=0 X Next, take the limit as T . As time elapses to inﬁnity all become aware of past news →∞ n so limi Et (rt+i)=limi Et rt+i =0. Moreover, since the probability of remain- →∞ →∞ ing inattentive falls exponentially¡ with¢ the length of the horizon, we approach this limit fast enough to ensure that the sum in the ﬁrst term converges. As for the second term,

19 n n limi Et (ct+i,0)=limi Et yt+i = yt .Theﬁrst equality holds because consumers →∞ →∞ are fully insured every period and£ in¤ the limit all are informed. The second equality holds n because yt follows a random walk. The expression above therefore becomes:

n ct,j = θEt j (Rt)+yt j, (43) − − − where Rt = Et ( i∞=0 rt+i), the long real interest rate. Replacing for ct,j in (38) gives the IS curve: P ∞ j n yt = δ (1 δ) Et j (yt θRt) . (44) − − − j=0 X We can ﬁrst-diﬀerence this equation to obtain an alternative representation of the IS:

2 ∞ j n n yt = Et (yt+1) δ (1 δ) [yt Et j (yt )] − − − − j=0 X ∞ j θδ(rt Rt) θδ (1 δ) Et j [(1 δ)rt + δRt] . − − − − − − j=0 X Similar steps, iterating forward on (34) and using the fact that ψ(wn pn) ln = ψyn/θ t − t − t t show that: n ψwt,j = Et j(lt,j + ψpt Rt)+ψyt j/θ. (45) − − −

Using this result as well as (33) and (35) to replace for lt,j, wt,j and lt in (37), gives a wage curve:

n ∞ j γ(wt pt) yt at ψ (yt θRt) wt = ω (1 ω) Et j pt + − + − + − . (46) − − γ + ψ β(γ + ψ) θ(γ + ψ) j=0 X ∙ ¸ The AS, the IS and the wage curve, together with the Fisher equation and the Taylor rule characterize the equilibrium for (yt,pt, wt, rt, it) given exogenous shocks to (at,εt) in the sticky information economy.

A.IV. Properties of the sticky information equilibrium

Finding the sticky information equilibrium. We ﬁnd the sticky information equilibrium

20 using a method of undetermined coeﬃcients. That is we guess that:

∞ yt = [ˆynet n +˜yn(at n at 1 n)] , (47) − − − − − n=0 X ∞ pt = [ˆpnet n +˜pn(at n at 1 n)] , (48) − − − − − n=0 X ∞ wt = [ˆwnet n +˜wn(at n at 1 n)] , (49) − − − − − n=0 X ∞ rt = [ˆrnet n +˜rn(at n at 1 n)] , (50) − − − − − n=0 X ∞ it = [ˆınet n +˜ın(at n at 1 n)] , (51) − − − − − n=0 X andlookforthecoeﬃcients with hats and tildes. TheASinequation(40)impliesthat:

β + ν(1 β) − ν(1 β) pˆn =(1 β)ˆyn + βwˆn, (52) Λ − − − ∙ n ¸ β + ν(1 β) − ν(1 β) p˜n =(1 β)˜yn + βw˜n 1, (53) Λ − − − − ∙ n ¸

n i for all n,whereΛn = λ (1 λ) ,theshareofﬁrms that have learned about a shock n i=0 − periods after it occurs.P The IS in equation (44) implies that:

yˆn = ∆nθRˆn, (54) − yñ = ∆n Ξ θRñ , (55) − ³ ´ n i for all n,where∆n = δ (1 δ) , the share of consumers that have learned about a i=0 − ˆ shock n periods after it occurs.P The definition of the long rate implies that Rn = i∞=0 rˆn+i ˜ and Rn = i∞=0 rñ+i. The wage curve in (46) in turn implies that: P P (γ + ψ Ωnγ)ˆwn = Ωnψpˆn + Ωn yˆn/β ψRˆn , (56) − − ³ ´ (γ + ψ Ωnγ)˜wn = Ωnψpñ + Ωn (ˆyn 1)/β + ψ(Ξ θRñ)/θ , (57) − − − h i n i for all n with Ωn = ω (1 ω) . i=0 − ˆ Using (52) to replaceP for wˆn and (54) to replace for Rn in (56), and first-differencing

21 (54) allows us to drop wages and the long rate and reduce the problem to ﬁnding prices, output and the short rate with two equations:

yˆn = Ψnpˆn, (58) yˆn+1 yˆn θrˆn = . (59) ∆n+1 − ∆n where we deﬁned a new parameter:

β+ν(1 β) θ∆n [ψ + γ(1 Ωn)] − ν(1 β) Ωnβψ − Λn − − − Ψn = (60) (1 nβ)(γ + ψ)θ∆n +hΩn θ∆n [1 γ(1 βi)] + ψβ o − { − − } For the coeﬃcients involving productivity shocks, we have instead:

yñ = Ψnpñ + Υn, (61) yñ+1 yñ θrñ = . (62) ∆n+1 − ∆n where the new parameter is:

θ∆n [γ + ψ + Ωn(1 γ)] Υn = − . (63) (1 β)(γ + ψ)θ∆n + Ωn θ∆n [1 γ(1 β)] + ψβ − { − − } Finally, using the Fisher equation to substitute nominal interest rates out of the Taylor rule and rearranging leads to:

n+1 φ pˆn = φ yˆn+1 +(1+φ )ˆpn+1 pˆn+2 rˆn+1 + ρ , (64) π y π − − φ pñ = φ yñ+1 +(1+φ )˜pn+1 pñ+2 rñ+1 φ Ξ, (65) π y π − − − y for n =0, 1, 2... There is also an initial condition from the Taylor rule at date 0:

φ yˆ0 +(1+φ )ˆp0 pˆ1 rˆ0 +1 = 0, (66) y π − − φ y˜0 +(1+φ )˜p0 p˜1 r˜0 = φ Ξ. (67) y π − − y

We have all the conditions we need to solve for the undetermined coeﬃcients on the impact of monetary and productivity shocks. Our algorithm that ﬁnds the impact of monetary shocks starts by choosing a very large number T and setting yˆn =ˆrn =0and pˆn =¯p

22 for n T . We know that for T =+ this guess is correct for some unknown positive ≥ ∞ value of p¯. Starting with a guess for p¯, the system made by equations (58), (59) and (64) recursively gives the solution for yˆn, rˆn,andpˆn for n = T 1,T 2,...0.Theﬁnal step − − is to check the initial condition (66). One can then search for the p¯ that ensures that (66) holds, which concludes the algorithm.

For productivity shocks, the algorithm is similar. For large T , yñ = Ξ, rñ =0and pñ =¯p for n T , and the system of equations (61), (62) and (65) recursively gives the ≥ solution for yñ, rñ,andpñ for n = T 1,T 2,...,0, while (67) is the initial condition used − − to pin down p¯.

Finally, (52) and (53) give the solution for wˆn and w˜n. Using the production function: yˆt ˆlt =(1 1/β)ˆyt and y˜t ˜lt =(1 1/β)˜yt +1/β.Notealsothat − − − −

∞ πt = [ˆπnet n +˜πn(at n at 1 n)] , (68) − − − − − n=0 X with πˆ0 =ˆp0 and πˆn =ˆpn pˆn 1 for n 1, and the same for π˜n. − − ≥ Calculating the predicted moments. We have characterized the representation of the economy in (47)-(51) and for yt lt in the previous paragraph. The population moment: − n ρ(πt+2 πt 2,yt yt )= − − −

2 2 n∞=0 (ˆπn+2 πˆn 2)ˆynσe + n∞=0 (˜πn+2 π˜n 2)(˜yn Ξ)σa − − − − − , 2 2 2 2 2 2 2 ∞ (ˆπPn+4 πˆn) σ + ∞ (˜πn+4P π˜n) σ [ ∞ yˆ σ + ∞ (˜yn Ξ)σ ] n=0 − e n=0 − a n=0 n e n=0 − a r hP P i P P (69) where πˆ i =˜π i =0for any positive i. Turning next to fact 2: − −

2 2 2 2 σ [∆(w p)] n∞=0 [ˆwn+1 pˆn+1 (ˆwn pˆn)] σe + n∞=0 [˜wn+1 pñ+1 (˜wn pñ)] σa − = − − − 2 − − − 2 . σ [∆(y l)] v ˆ ˆ 2 ˜ ˜ 2 − uP ∞ yˆn+1 ln+1 (ˆyn ln) σe + P∞ yñ+1 ln+1 (˜yn ln) σa u n=0 − − − n=0 − − − u t P h i P h (70)i Finally, to assess fact 3 we can calculate:

2 2 2 2 2σ(yt yt 1) n∞=0 (ˆyn+1 yˆn) σe + n∞=0 (˜yn+1 y˜n) σa − − =2 − − . (71) 2 2 2 2 σ(yt yt 4) s ∞ (ˆyn+4 yˆn) σ + ∞ (˜yn+4 y˜n) σ − − Pn=0 − e Pn=0 − a P P

23 Fitting the model to the data. We solve the problem:

2 2 n 2 σ(∆(w p)) 2σ(yt yt 1) min (ρ(πt+2 πt 2,yt yt ) 0.47) + − 0.69 + − − 0.79 , λ,δ,ω − − − − σ(∆(y l)) − σ(yt yt 4) − ( µ − ¶ µ − − ¶ ) (72) using the formulae in (69)-(71) and the coeﬃcients provided by the algorithm above as a function of the parameters. Since we know little about the function that we are minimizing, we tried diﬀerent non-linear minimization algorithms to search for the optimal λ, δ, ω in { } the space [0, 1]3.

24 References

Basu, Susanto and John G. Fernald “Aggregate Productivity and the Productivity of Aggregates.” National Bureau of Economic Research, Inc., NBER Working Papers: No. 5382, 1995. Gabaix, Xavier and David Laibson. “The 6D Bias and the Equity Premium Puz- zle,” in Ben Bernanke and Kenneth Rogoﬀ,eds.,NBER Macroeconomics Annual 2001. Cambridge: MIT Press, 2002, pp. 257-312. Lucas, Robert E. Jr. and Leonard A. Rapping. “Real Wages, Employment, and Inﬂation.” Journal of Political Economy, 1969, 77, pp. 721-754. Mankiw, N. Gregory and Ricardo Reis. “Sticky Information versus Sticky Prices: A Proposal to Replace the New Keynesian Phillips Curve.” Quarterly Journal of Economics, 2002, 117 (4), pp. 1295-1328. Mankiw, N. Gregory and Ricardo Reis. “Sticky Information: A Model of Mone- tary Non-neutrality and Structural Slumps,” in Philippe Aghion, Roman Frydman, Joseph E. Stiglitz and Michael Woodford, eds., Knowledge, Information, and Expectations in Mod- ern Macroeconomics: In Honor of Edmund S. Phelps, Princeton: Princeton University Press, 2003. Parker, Jonathan A. and Christian Julliard. “Consumption Risk and the Cross- section of Expected Returns.” Journal of Political Economy, 2005, 113 (1), pp. 185-222. Reis, Ricardo. “Inattentive Consumers.” National Bureau of Economic Research, Inc., NBER Working Papers: No. 10883, 2004. Reis, Ricardo. “Inattentive Producers,” Review of Economic Studies (forthcoming). Rudebusch, Glenn D. “Term Structure Evidence on Interest Rate Smoothing and Monetary Policy Inertia.” Journal of Monetary Economics, 2002, 49 (6), pp. 1161-1187. Sims, Christopher A. “Stickiness,” Carnegie Rochester Conference Series on Public Policy, 1998, 49 (1), pp. 317-356. Woo dford, Michael. Interest and Prices, Princeton: Princeton University Press, 2003.