<<

A Semi-structural Methodology for Policy Counterfactuals∗

Martin Beraja† Massachusetts Institute of Technology and NBER

November 26, 2020

Abstract I propose a methodology for constructing counterfactuals with respect to changes in policy rules which does not require fully specifying a structural model, yet is not subject to Lucas Critique. It applies to a class of dynamic stochastic models whose equilibria are well approx- imated by a linear representation. It rests on the insight that many such models satisfy a principle of counterfactual equivalence: they are observationally equivalent under a bench- mark policy rule and yield an identical counterfactual equilibrium under an alternative one. As an application, I use the methodology to quantify how US fiscal integration contributes to regional stabilization.

∗I am indebted to my advisors Fernando Alvarez, Erik Hurst, and Veronica Guerrieri, for their encouragement, support, and guidance during my stay in Chicago, for our many insightful conversations, and for their patience and skill in distilling the important points from my presentations. For their valuable suggestions, I owe George- Marios Angeletos, Andy Atkeson, Philip Haile, Ricardo Lagos, John Leahy, Martin Schneider, Gianluca Violante, and Ivan Werning as well as my classmates Jonathan Adams, Philip Barret, and Chanont Banternghansa. I thank those at the Review of Economic Studies Tour, Minneapolis Fed Junior Scholar Conference, SED, “Pizzanomics,” NBER SI Dynamic Equilibrium Models, and Macro Seminars at MIT, Harvard, Yale, Wharton, UCLA, GWU, Stanford, Princeton, NYU, Michigan, Northwestern, Cornell, Brown, LSE, BU, Berkeley, Fed Board, Atlanta Fed, Chicago Fed, BCRA, and Norges Bank. I thank all participants of the Capital Theory Working Group, especially Harald Uhlig and Nancy Stokey. Finally, I thank Juliette Caminade for her support and comments on preliminary versions of this paper. †E-mail: [email protected]. Web: .mit.edu/faculty/maberaja 1 Introduction

Economists commonly follow either structural or reduced-form (and semi-structural) approaches in order to answer counterfactual questions. The former relies on specifying primitives of a par- ticular model and identifying its parameters. This is a daunting task whenever researchers are uncertain about features of alternative models that are both difficult to distinguish with avail- able data and reasonable a-priori. If counterfactuals differ under these alternative models, their credibility is undermined because they lack robustness. The latter approaches require less com- mitment to particular model assumptions and have proven very useful to identify consequences of observed policy changes that are robust across models, as in Sims (1980). However, the struc- tural approach is the leading paradigm for studying unobserved, potential changes in policy rules because structural counterfactuals are not subject to the critique in Lucas (1976). Are there then reasonable circumstances when we can analyze the consequences of a counterfactual policy rule change without knowing the full structure of a model? And, relatedly, can we tell which features of disparate models crucially determine such consequences? In this paper, I propose a novel semi-structural method for constructing counterfactuals with respect to policy rule changes. It is useful in circumstances when a linearization approach is reasonable and we do not wish to fully specify a structural model, yet we are worried about using approaches that contain too little information for the analysis to be immune to Lucas Critique — like those based on structural VARs.1 The method hinges on an insight about dynamic stochastic models with equilibria that are well approximated by a linear, stable recursive representation. The insight is that many models that can match an economy’s recursive equilibrium under a benchmark policy rule—i.e., observationally equivalent models—will also generate an identical counterfactual equilibrium under an alternative policy rule. Thus, regardless of their specific microfoundations, such models will satisfy this Principle of Counterfactual Equivalence. The method has three steps. The first is using data from an economy under a benchmark policy rule to estimate a recursive representation of the equilibrium of dynamic stochastic models (e.g., a SVAR representation). The second is describing assumptions that lead to linear restrictions on these models’ structural equilibrium equations (e.g., exclusion restrictions). They need to be enough to identify all combinations of policy-invariant parameters in these equations — what I call a structure — given the benchmark recursive representation. The last step is solving for the counterfactual recursive representation under an alternative policy rule, given the identified structure. As such, the method is in the spirit of the literature on “sufficient statistics” (e.g., Chetty (2009), Alvarez, Le Bihan, and Lippi (2016)) as well as earlier ideas advanced at the Cowles Commission (e.g., Koopmans (1950), Marschak (1974), Hurwicz (1962)).2 Together, the

1Bernanke, Gertler, and Watson (1997) and Sims and Zha (2006) propose to analyze counterfactual policy rules in SVARs by “zeroing-out” policy responses to shocks. Despite being subject to Lucas Critique, this methodology is com- monly used in many policy institutions, as are macroeconometric models like FRB/US. Del Negro and Schorfheide (2009) studies interest rate counterfactuals in New Keynesian models and asses the sensitivity of results to model misspecification by considering perturbations around their equilibrium SVAR representation. 2For an excellent discussion of these ideas see Heckman and Vytlacil (2007). As they explain, Marschak’s Maxim

1 benchmark and counterfactual representations define a set of structures — which is larger than a singleton containing the identified structure — characterizing models that can generate them, and, therefore, are counterfactually equivalent. The counterfactual is immune to Lucas Critique if the data was generated by a model whose structure is in this set too. I begin the paper by formally describing the principle of counterfactual equivalence, and give necessary and sufficient conditions for a model’s structure to belong to a counterfactually equiv- alent set (Section 2). Then, I present the semi-structural method to construct counterfactuals with respect to policy rule changes (Section 3), which I call Robust Policy Counterfactuals. Parallel to the general treatment, I present an example in the context of fiscal union models where members receive federal transfers according to a transfer policy rule which summarizes how resources are redistributed in a state-contingent manner. In the Online Supplementary Material S.4, I present additional examples in the context of New Keynesian and DMP-style search and matching mod- els. Finally, in order to show how to construct Robust Policy Counterfactuals in practice, I use US state-level data to quantify how fiscal unions contribute to regional stabilization (Section 4) — a question for which, despite plenty of empirical and theoretical research (e.g., Sala-i Martin and Sachs (1991), Asdrubali, Sorensen, and Yosha (1996), Feyrer and Sacerdote (2013), Farhi and Werning (2014)), quantitative work is surprisingly scarce (see Evers (2015) for an exception). There are two main findings. First, in a counterfactual US economy without a transfer policy rule, the standard deviation of employment across states increases by about 1 percent in the Great Recession and 1.5 percent in the long-run. This is because the rule redistributes resources from well to poorly performing states, thus stabilizing regional economies.3 Moreover, had we used the “zeroing-out” methodology in Sims and Zha (2006) to construct the long-run counterfactual, we would have found an increase of only 0.5 percent — thus underestimating the stabilizations gains by ignoring changes in agents’ behavior and expectation-formation as they internalize the policy change. These results can therefore help inform policy-makers when weighing the stabilization benefits against the costs of future fiscal arrangements for the European Monetary Union. Second, exogenous shifters of the Euler equation are a key feature of fiscal union models that generate large stabilization gains, whereas other features like the particular form of wage stickiness or adjustment costs, whether there is discounting due to behavioral agents, or how hand-to-mouth agents are introduced seem to be less crucial. The counterfactually equivalent set I study includes the structures of many models with such features. This finding can thus help organize and rationalize previous quantitative results coming from seemingly disparate models which implied small stabilization gains from fiscal integration (Evers (2015)) and risk-sharing arrangements more generally (as in Backus, Kehoe, and Kydland (1992) and the subsequent literature). says that for many questions of policy analysis it may not be necessary to fully identify a structural model, but only combinations of subsets of the structural parameters. Hurwicz (1962) noted that these should also be policy invariant. 3For a sense of magnitudes, aggregate output volatility declined by about 0.7 percent during the “Great Modera- tion” (Clarida, Gali, and Gertler (2000)). The stabilization gains from fiscal integration are of similar magnitude.

2 2 The Principle of Counterfactual Equivalence

This section describes how to characterize a set of models that are counterfactually equiva- lent with respect to a policy rule change within a class of linear models of dynamic stochastic economies. Parallel to the general treatment, I present an example in the context of models of a small open economy which receives transfers from abroad according to a transfer policy rule. In Online Supplementary Material S.4, I present two more examples in the context of: New Keyne- sian models with an interest rate rule, and search-and-matching models of the labor market with an unemployment benefits rule. I begin by defining a class of models, and develop notation and language that I use through- out. Consider a model j whose equilibrium is generically described by the non-linear conditions:

0 = Et fj (xt+1, xt, xt 1, zt+1, zt) − 0 = Et [θ (xt+1, xt, xt 1, zt+1, zt)] − 0 = g(zt+1, zt) + et+1 iid et+1 with E[et+1] = 0, Var(et+1) = Σ, where xt is a column vector of length ‘k’ that includes all observed endogenous state variables and could include observed endogenous control variables as well, zt is a column vector of length 4 ‘s’ that includes exogenous unobserved state variables, fj(.) is a vector function with codomain Rk, and θ(.) is a policy rule with codomain Rp, where p is the number of policy variables.5 A first order approximation around the non-stochastic steady state of model j results in the following system of equations:

j j j j j 0 = F Et[xt+1] + G xt + H xt 1 + L Et[zt+1] + M zt − 0 = Θ f Et[xt+1] + Θcxt + Θpxt 1 + Θzzt (SME) − 0 = z + Nz + e , − t+1 t t+1

6 where, with some abuse of notation, xt and zt now represent deviations from the steady state. From now on, I say that a model j is characterized by a structure of policy-invariant matrices ξ j Fj Gj Hj (Lj N + Mj) .7 Moreover, a policy is characterized by matrices collecting ≡ endogenoush policy rule parametersi Θ Θ Θ Θ Θ . Elements in structure ξ j involve ≡ f c p z combinations of subsets of parameters derivedh from the fullyi specified model j. They typically lack direct economic interpretation in terms of the primitives of such a model. Elements in Θ

4Note that restricting the system to only one lag is without loss of generality because we can always “stack” equilibrium variables in models with more than one lag. 5Formally, the full description of the equilibrium conditions also includes initial conditions for endogenous and exogenous state variables, as well as terminal conditions. Moreover, note that neither the policy nor the laws of motion for exogenous variables are indexed by j. 6Uhlig (1995), from whom I borrow some notation, studies a very similar system of equations. 7Note that this assumes that the policy rules do not affect the non-stochastic steady state, since ξj will typically include variables evaluated at their steady state values.

3 generally have a direct economic interpretation.

Example. Consider a small open economy that is part of a monetary and fiscal union which receives lump-sum transfers from the federal government τt (see Online Supplementary Material S.1 for a detailed description). Transfers follow the policy rule τt = θnnt with θ < 0 and nt is employment in log-deviations from the union aggregate. The approximate equilibrium equations in log-deviations from the aggregate are:

1 1 φ 0 = E (n n ) + α + (1 α) E (w w ) + ( 1)a + n (Euler) t t+1 − t σ − t t+1 − t σ − t σ t   (1 + ν) 0 = αw + 1 n a (Labor market) − t (1 σ) − t − t  −  B¯ B¯ η¯ 0 = bt + (1 + r) bt 1 + (ηt (wt + nt)) + τt (Budget Constraint) − τ¯ τ¯ − τ¯ − 0 = τ + θ n (Policy) − t n t η 0 = a + ea ; 0 = η + e (Shocks) − t+1 t+1 − t+1 t+1 where wt are wages, bt are bond holdings, at is a productivity shock in the non-tradable good sector, and ηt 1 is the exogenous endowment of tradable goods. Moreover, σ is the intertemporal elasticity of substitution, α is the labor share in non-tradable production, φ is the elasticity of the endogenous discount factor to 8 1 changes in employment, ν is the Frisch labor supply elasticity, and r is the interest rate. 0 0 Then, for instance, letting x n w b τ and z a η , we have that the first line t ≡ t t t t t ≡ t t ξ1 in the structure (corresponding toh the Euler equation)i is: h i

F1. G1. H1. L1. N+M1.

1 φ 1 1 ξ1 1 α + (1 α) 0 0 1 + α + (1 α) 0 0 0 1 0 (Euler) ≡  σ − − σ − σ − 1.4 σ −  z }| { z }| { z }| { z }| {      Next, I restrict the class of linear models in (SME) to those that satisfy two assumptions. Then, I define a recursive representation of the equilibrium. The assumptions ensure that a stable and unique recursive representation of the equilibrium exists, thus ruling out models with unstable dynamics or multiple equilibria. These are beyond the class of models studied in this paper.

Assumption 1. (Stability) ξ, N, Θ are such that the system (SME) is stabilizable.

Under Assumption 1, we have a stable recursive solution to (SME) that can be written as:

xt = P(ξ, Θ)xt 1 + Q(ξ, N, Θ)zt (RR) − zt = Nzt 1 + et − where P(ξ, Θ) and N have all eigenvalues inside the unit circle. In particular, using the method

8As in Schmitt-Grohé and Uribe (2003), this is a way of “closing” small open economy models to induce stationarity.

4 of undetermined coefficients, P(ξ, Θ), Q(ξ, N, Θ) solve the non-linear system of matrix equations

(P)2 QN + PQ ξ(k p).(3k+s)  PQ  − = 0k.(k+s). (SMERR) " Θp.(3k+s) # Ik 0k    0s Is   (3k+s).(k+s)   Assumption 2. (Uniqueness) ξ, N, Θ are such that P(ξ, Θ), Q(ξ, N, Θ) are unique.

Definition 1. (Recursive representation) Γ(ξ, Θ) P(ξ, Θ), Q(ξ, N, Θ), N is the unique and ≡ { } stable recursive representation of the equilibrium under policy Θ in a model characterized by structure ξ.

2.1 Observationally equivalent models under a benchmark policy Θ0

Imagine that we parameterize a model j∗, resulting in structure ξ∗. We consider a benchmark policy Θ0 and obtain the recursive representation of the equilibrium Γ0 Γ(ξ , N0, Θ0). We now ≡ ∗ ask: what are the set of models j that can generate Γ0? Or, in other words, which models are 0 9 observationally equivalent to j∗ under the benchmark policy Θ ? The first step in answering this question is realizing that, regardless of their primitives and particular parameters, any two models j and i with an identical structure ξ = ξ j = ξi will generate an identical recursive representation Γ(ξ, N, Θ) for any Θ. Thus, as the following definition states, the question of observational equivalence can be understood as one about model structures.

Definition 2. (Observational equivalence) Given Θ0, Γ0 , define (Θ0, Γ0) as the set of struc- 0 { } O 0 tures ξ that generate equilibrium representation Γ under benchmark policy Θ . Model j and j∗ are observationally equivalent if and only if model j can be parameterized so that ξ j (Θ0, Γ0). ∈ O 0 0 Because, given policy Θ and model j∗’s structure ξ∗, the recursive representation Γ solves (SMERR), we immediately obtain the following lemma characterizing (Θ0, Γ0). O Lemma 1. (Necessary and sufficient conditions for observational equivalence) A structure ξ belongs to (Θ0, Γ0) if and only if every line ξ for l = 1, 2, .., k p satisfies the following linear O l { − } system, given Γ0:

0 20 00 (P ) P Ik 0k.s 0 (ξl) = 0(k+s).1 (Null OE) (Q0N0 + P0Q0)0 Q00 0 I " k.k s #(k+s).(3k+s)

A number of comments are in order. First, notice that a researcher solving for Γ0 needs to solve a non-linear system (SMERR), given a structure and policy ξ , Θ0 and N0. However, { ∗ } consider the reverse mapping instead. Given values for the recursive representation Γ0 and policy Θ0, (SMERR) implies a linear system of equations that any structure ξ must satisfy in order

9Note that this notion of observationally equivalence is stronger than the usual one because two models cannot be told apart even if the structural recursive representation of the equilibrium (Γ0) was known, whereas the usual definition would involve knowledge of the reduced-form recursive representation alone.

5 to match Γ0 under policy Θ0. This linear mapping then characterizes a set of observationally equivalent models, as stated in the lemma above. Formally, a structure belongs to (Θ0, Γ0) if O and only if each of its lines is in the null space of the (k + s).(3k + s) matrix above and thus solve (Null OE). Second, we can readily see that, because 3k + s > k + s, the linear system (Null OE) is in general underdetermined. This implies that (Θ0, Γ0) includes multiple structures ξ. In other O words, knowledge of Γ0, Θ0 alone are not enough to uniquely identify the model (described by 0 0 ξ∗) which generated such Γ under Θ .

Example. Instead of our previous baseline example, consider a different small open economy model with a “discounted” Euler equation, as in Gabaix (2020) or McKay, Nakamura, and Steinsson (2017) — which we call model ‘1’ and is described in Online Supplementary Material S.2. Given a parameter δ governing discounting, the first line in the structure ξ1 becomes

ξ1 δ αδ + 1 (1 α) 0 0 1 α + 1 (1 α) 0 0 0 1 1 0 . (Euler 1) 1 ≡ σ − − − σ − 1.4 σ − h  i Alternatively, consider a model with a quadratic portfolio adjustment cost, as in Schmitt-Grohé and Uribe (2003) — which we call model ‘2’ and is also described in Online Supplementary Material S.2. Given a parameter ψ governing the cost, the first line in the structure ξ2 becomes

¯ ξ2 1 α + 1 (1 α) 0 0 1 α + 1 (1 α) ψB 0 0 1 1 0 . (Euler 2) 1 ≡ σ − − − σ − σ 1.4 σ − h  i 0 Imagine that we have parameterized the baseline small open economy model, resulting in ξ∗ and Γ under policy Θ0. Online Supplementary Material S.2 shows that, even when models ’1’ and ’2’ do not nest the baseline model,10 there are parameterization of these models such that both ξ1, ξ2 (Γ0, Θ0) for any ∈ O parameterization of the baseline, and, therefore, all models are observationally equivalent under policy Θ0.

2.2 Counterfactually equivalent models when policy changes from Θ0 to Θ1

1 Consider an alternative policy Θ . We again solve model j∗ and obtain the recursive representa- 1 1 tion of the equilibrium Γ Γ(ξ∗, Θ ). We now ask: from those models that were observationally ≡ 0 equivalent to j∗ under policy Θ , which models would also generate the equilibrium represen- tation Γ1 if the policy were to change from Θ0 to Θ1? Or, in other words, which models are 0 1 observationally equivalent to j∗ both under policies Θ and Θ ? As the following definition states, we will call such models counterfactually equivalent.

Definition 3. (Counterfactual equivalence) Given Θ0, Γ0 and Θ1, Γ1 , define (Θ0, Γ0, Θ1, Γ1) { } { } C as the subset of structures ξ in (Θ0, Γ0) that generate equilibrium recursive representation Γ1 under O alternative policy Θ1. Formally, (Θ0, Γ0, Θ1, Γ1) = (Θ0, Γ0) (Θ1, Γ1). Model j and j are coun- C O O ∗ terfactually equivalent if and only if model j can be parameterized so that ξ j (Θ0, Γ0, Θ1, Γ1). T ∈ C 10Notice that, even when δ = 1, (Euler 1) is different than (Euler) because of the endogenous discount factor (governed by φ). Therefore, model ’1’ does not nest the baseline. The same holds for model ’2’ even when ψ = 0.

6 Given the above definition and Lemma 1, we obtain the following lemma characterizing the counterfactually equivalent set (Θ0, Γ0, Θ1, Γ1).11 C Lemma 2. (Necessary and sufficient conditions for counterfactual equivalence) A structure ξ belongs to (Θ0, Γ0, Θ1, Γ1) if and only if every line ξ for l = 1, 2, .., k p satisfies both (Null OE) C l { − } and the following linear system, given Γ0 and Γ1:

(P1)20 (P0)20 P10 P00 F0 − − l. = 0 . (Null CE) 1 0 1 1 0 0 0 0 10 00 0 (k+s).1 " (Q N + P Q )0 (Q N + P Q )0 Q Q # Gl. − − (k+s).2k  2k.1

We have seen before that (Θ0, Γ0) is not a singleton because (Null OE) was underdeter- O mined. Is it the case now that (Null CE) is determined and, therefore, (Θ0, Γ0, Θ1, Γ1) is a C singleton? Propositions 1.A and 1.B show that the answer is again no, and further characterize this counterfactually equivalent set.

0 k p l 0 Proposition 1.A. (Sufficient condition for counterfactual equivalence) If (ξl) = ∑n−=1 cn (ξn∗) for all l = 1, 2, .., k p and some constants cl , then ξ (Θ0, Γ0, Θ1, Γ1). { − } n ∈ C Proof. See Appendix A.

The proposition shows that if a model j can be parameterized so that the lines in its structure j ξ can be written as a linear combination of the lines in ξ∗, then this is a sufficient condition 0 1 for j and j∗ to be counterfactually equivalent when the policy changes from Θ to Θ . This is because every line in ξ∗ is a solution to (Null OE) and (Null CE) and thus so are any linear combination of them. Therefore, (Null CE) is underdetermined with rank less than 2k (k p) − − and the counterfactually equivalent set includes more structures than ξ∗ alone. Are these structures the only ones in (Θ0, Γ0, Θ1, Γ1)? The next proposition shows that this is C not the case when the number of endogenous and exogenous state variables (rank(P1 P0) + s) − is strictly smaller than k + p, where, as reminder, k is the total number of endogenous variables (policy or not) and p is the number of policy variables. Otherwise, the sufficient condition in Proposition 1.A may or may not be necessary as well, depending on the application.

Proposition 1.B. (Sufficient condition is not necessary) If rank(P1 P0) + s < k + p, then − (Θ0, Γ0, Θ1, Γ1) includes structures ξ which do not satisfy the sufficient condition in Proposition 1.A. C Proof. See Appendix A.

Taken together, the propositions imply that there are three “classes” of counterfactually equivalent models. The first class are those models j which can be parameterized so that j ξ = ξ∗, which is a special case of the sufficient condition in Proposition 1.A. While their structures are identical, these counterfactually models are nevertheless different to each other in the sense that they have different microfoundations and, therefore, different nonlinear con- ditions 0 = Et fj (xt+1, xt, xt 1, zt+1, zt) characterizing their equilibrium. The second class are − 11(Null CE) is obtained by subtracting (Null OE) under policy Θ0 from the corresponding system under policy Θ1.

7 j those models whose structures ξ differ from ξ∗, but can nevertheless be parameterized so that j the lines in ξ are linear combinations of those in ξ∗. The last class are models whose structures do not satisfy this sufficient condition, but can still be parameterized to satisfy the necessary and sufficient conditions in Lemma 2.

Example. Online Supplementary Material S.2 shows that, while both models ’1’ and ’2’ are observation- ally equivalent to the baseline model under policy Θ0, only model ’1’ is counterfactually equivalent to the baseline model when policy changes from Θ0 to Θ1. Formally, for the parameterizations that make ξ1, ξ2 belong to (Θ0, Γ0), only ξ1 is also in (Θ0, Γ0, Θ1, Γ1) because ξ1 can be written as δξ + (1 δ)ξ . O C 1 1∗ − 2∗

3 A Methodology for Constructing Robust Policy Counterfactuals

0 Using data on variables xt from an economy under a benchmark policy rule Θ , a researcher has estimated a recursive representation of their dynamics Γ0.12 The researcher would like to evaluate how the behavior of xt would change in a counterfactual economy with an alternative, 1 counterfactual policy rule Θ . That is, she would like to construct a counterfactual Γ1. Because she is uncertain about many features of the economy, she is reluctant to specify a particular structural model where the counterfactual results may not be robust to variation in such fea- tures. However, she is also worried that leading semi-structural methods for performing this type of counterfactual exercise in SVARs may not be appropriate because they may not be im- mune to Lucas Critique (e.g., Bernanke, Gertler, and Watson (1997) and Sims and Zha (2006)).

Especially, because she is concerned about how the behavior of xt would change in the new long-run equilibrium once all agents understand the policy rule has changed, as opposed to the short-run equilibrium dynamics immediately after the policy change where it is plausible that agents may not have understood that such change has occurred. I now present a semi-structural methodology for tackling this problem. It allows researchers to construct Robust Policy Counterfactuals, i.e., counterfactuals with respect to policy rule changes that are both immune to Lucas Critique and robust to variations across models as long as their structures belong to a counterfactually equivalent set (Θ0, Γ0, Θ1, Γ1). C As before, the set of observationally equivalent models j that could have generated Γ0 are those characterized by a structure ξ j (Θ0, Γ0). Yet, as we mentioned, knowledge of Θ0, Γ0 ∈ O alone is not sufficient to construct a unique counterfactual under policy Θ1. For example, some observationally equivalent models will generate a counterfactual Γ1 whereas others Γ10. That is, their structures belong to different counterfactually equivalent sets: (Θ0, Γ0, Θ1, Γ1) and C (Θ0, Γ0, Θ1, Γ10). In fact, this gives us one way of understanding Lucas Critique. If the “true C model” generating the data on x (and thus Γ0) is characterized by structure ξ (Θ0, Γ0, Θ1, Γ1), t ∗ ∈ C 12Note that this starting point assumes that we have estimated not only the reduced-form representation of the equilibrium but have also identified the impulse response matrix Q0. The next section describes how we can do so by first estimating a reduced-form VAR and then imposing additional restrictions to recover Γ0.

8 then a counterfactual is not immune to Lucas Critique when it is constructed from a model char- acterized by a structure ξ / (Θ0, Γ0, Θ1, Γ1). ∈ C Thus, we need to impose additional restrictions on the set of models that could have gener- 0 ated Γ in order to construct a unique counterfactual. In this paper, I will consider nl indepen- dent linear restrictions imposed directly on the structure of such models which take the form of l l R (ξ )0 = r for each line l 1, 2, .., k p . For instance, as the example below and the ap- nl l. nl ∈ { − } plication in the next section illustrates, these could be exclusion restrictions on certain variables, implying that they do not appear in a given equilibrium equation l. The central question of how to construct a Robust Policy Counterfactual can now be stated as: What additional restrictions Rl , rl need to be imposed on structures ξ (Θ0, Γ0), so nl nl ∈ O that knowledge of Γ0, Θ0 uniquely identifies a structure ξ¯ and, therefore, results in a unique counterfactual Γ(ξ¯, Θ1)? Moreover, what is the set of counterfactually models described by (Θ0, Γ0, Θ1, Γ(ξ¯, Θ1))? C The following theorem shows the answer to both questions. First, imposing nl = 2k indepen- dent linear restrictions for each line l = 1, ...k p is necessary and sufficient. This identifies a { − } unique ξ¯ and allows us to construct a unique counterfactual Γ(ξ¯, Θ1). The answer to the second question comes from Lemma 2 and Propositions 1.A and 1.B, which imply that there are many ξ’s which belong to (Θ0, Γ0, Θ1, Γ(ξ¯, Θ1)). C Theorem 1. (Constructing a Robust Policy Counterfactual) Let Rl , rl describe a set of 2k { 2k. 2k.1} independent linear restrictions on the coefficients in line l = 1, ...k p of a structure ξ. Given Θ0, Γ0 { − } { } and Rl , rl , assume that structure ξ¯ is such that: { 2k. 2k.1} (i). ξ¯ is a solution to (Null OE), and thus Γ(ξ¯, Θ0) = Γ0.

(ii). ξ¯ satisfies Rl (ξ¯ ) = rl for each line l = 1, ...k p . 2k. l. 0 2k.1 { − } Then, the following are true:

(a). ξ¯ is unique.

(b). Given ξ¯, Θ1, there is a unique Robust Policy Counterfactual Γ(ξ¯, Θ1) which is constructed as the solution to (SMERR).

(c). Any model j characterized by structure ξ j will generate Γ(ξ j, Θ0) = Γ0 and Γ(ξ j, Θ1) = Γ(ξ¯, Θ1) whenever ξ j C(Θ0, Γ0, Θ1, Γ(ξ¯, Θ1)), as characterized by Lemma 2 and Propositions 1.A and 1.B. ∈ Proof. See Appendix B.

The key to proving Theorem 1 comes from Lemma 1. We noticed before that in order to find the equilibrium recursive representation Γ0 under policy Θ0, one has to solve the non-linear system of equations (SMERR), given structure and policy ξ¯, Θ0 . However, the system of equa- { } tions (Null OE) is linear in structure ξ¯, given values for the recursive representation Γ0 and policy Θ0. This linear mapping is, nevertheless, underdetermined. Yet, imposing 2k additional and

9 independent linear restrictions on each line of ξ¯ suffices to make the linear system exactly deter- mined and uniquely identify ξ¯. Moreover, the theorem shows that to construct the Robust Policy Counterfactual Γ(ξ¯, Θ1) we only need to solve the non-linear system (SMERR) when the policy is Θ1 instead of Θ0. As we have shown and characterized in Lemma 2 and Propositions 1.A and 1.B, we can then construct a counterfactually equivalent set (Θ0, Γ0, Θ1, Γ(ξ¯, Θ1)) around C this counterfactual. Lastly, the counterfactual is immune to Lucas Critique if the “true” model’s structure that generated Γ0 under policy Θ0 belongs to this counterfactually equivalent set.13 The example below illustrates the restrictions satisfied by the fiscal union models from the previous section. Section 4 then shows how to apply the methodology in the context of richer fiscal union models.14

j Example. The first line ξ1 in the structures of the baseline model and models ’1’ and ’2’ all satisfy the 1 1 exclusion restriction below ( R , r ) because wt 1 does not show up in any of their Euler equations. { 1 1} −

1 j j 1 R (ξ )0 0 1 0 (ξ )0 = 0 r 1 1 ≡ 1.9 1.4 1 ≡ 1 h i Also, ξ j in the baseline model and model ’2’ satisfy the restriction below ( R1, r1 ) because the coefficients 1 { 2 2} associated with w and w add up to zero. However, ξ1 in model ’1’ does not satisfy it when δ = 1. t+1 t 1 6

1 j j 1 R (ξ )0 0 1 0 0 0 1 0 (ξ )0 = 0 r 2 1 ≡ 1.8 1 ≡ 2 h i 4 Application: Regional Stabilization in Fiscal Unions

In this section, I use state-level US data to quantify how fiscal unions contribute to regional stabilization by redistributing resources across its members through a transfer policy rule. I begin by describing models of fiscal unions in some generality. Then, I construct a Robust Transfer Policy Counterfactual that corresponds to a US economy without a transfer policy rule. I will focus on models that satisfy four properties. Examples of models that satisfy them are those in Beraja, Hurst, and Ospina (2015), and Online Supplementary Material S.1 and S.2.15

Assumption 3. Models of fiscal unions satisfy the following properties:

1. Transfer policy rule: The tax-and-transfer system can be summarized by federal lump-sum trans-

fers τt that are a function of state-level economic variables.

2. Linear aggregation: State-level economies in log-deviations from the aggregate union behave to a first-order approximation as if they were small open economies—independent of other states.

13As is the case when performing counterfactuals with a fully specified model, we cannot test whether such model is the “true” data-generating process and, thus, the exercise is immune to Lucas Critique. Likewise, we cannot test whether the “true” model’s structure belongs to the counterfactually equivalent set. Yet, this set at least includes more than a single fully specified model. 14In the Online Supplementary Material S.4.2, I also show how to construct counterfactuals with respect to interest rate rule changes in the context of 3-equation New Keynesian models. 15Farhi and Werning (2014), Nakamura and Steinsson (2014), and Evers (2015) satisfy some of these properties too.

10 3. 4 by 3: Employment nt, nominal wages wt, assets bt, transfers τt; and exogenous processes γ , η , z are sufficient variables for characterizing the state-level equilibrium in log-deviation from { t t t} aggregates.

4. SVAR: The log-linearized equilibrium has a unique, finite, and stable structural vector autoregres- sion representation.

Properties 1-3 imply that we can characterize the equilibrium in any state in log-deviations from the aggregate union with a system of equations as (SME), where x n w b τ 0, t ≡ t t t t h i z γ a η 0, k = 4, s = 3, and p = 1.16 Moreover, without loss of generality, I will say t ≡ t t t that theh first equationi in (SME) is the (Euler) equation in these models, the second is the (Labor Market) equation, and the third is the sequential budget constraint (Budget Constraint). Property 4 requires not only that a stable recursive representation exists (Assumptions 1 and 2), but that it can be written as a finite SVAR. Later, I estimate such SVAR and use it to infer Γ0 under the benchmark policy Θ0—a necessary input for constructing a Robust Policy Counterfactual. Given Property 1 in Assumption 3, I assume that the benchmark transfer policy rule depends on state-level employment, current wages and asset holdings at the beginning of the period.

Assumption 4. The benchmark policy Θ0 is: Θ0 = ϑ ϑ 0 1 , Θ0 = 0 0 ϑ 0 , and c n w − p b 0 0 Θ f = Θz = 01.4 h i h i Next, following Theorem 1, I impose enough linear restrictions on each line in the structure ξ¯ of fiscal union models in order to construct a unique counterfactual. I assume they take the form of exclusion restrictions (and a normalization of one of the elements to 1). Specifically,

Assumption 5. The structure ξ¯ satisfies the following restrictions:

f11 f12 0 0 g11 g12 0 0 h11 0 0 0 ¯ ¯ ¯ F =  f21 f22 0 0  ; G =  g21 g22 0 0  ; H =  h21 h22 0 0  ; 0 0 0 0 g g g 1 0 0 h 0    31 32 33   33        1 0 0 0 m12 0 n11 n12 n13 ¯ ¯ L =  0 0 0  ; M =  0 1 0  ; N =  0 n22 n23  0 0 0 0 0 m 0 n n    33   32 33        The key feature of models described by this structure is that the dependence of the (Labor Market) and (Euler) equations — the first and second lines in ξ¯ — on future and lagged variables is relatively unconstrained, as well as the exogenous processes and their correlation structure. This is important for the question of regional stabilization in fiscal unions because it means that the counterfactually equivalent set will encompasses many models with rich features — going beyond the simpler examples in previous sections. For instance, models with varying

16Property 1 excludes models where the tax-and-transfers system affects decisions at the margin, as it would be the case with distortionary taxation. In Online Supplementary Material S.3.5, I relax this assumption and analyze the sensitivity of the results.

11 microfoundations for wage rigidities that are both forward- and backward-looking, employment adjustment costs, different utility functions, or behavioral features that affect not only (Euler). Moreover, the absence of expected and lagged terms beside assets in the third equation makes this structure consistent with most log-linearized, incomplete market models that include a (Budget constraint). Also, I assume that the only exogenous shifter in the sequential budget constraint is the wealth process ηt. The other two exogenous processes do not appear in the sequential budget constraint. In terms of the (Euler) and (Labor Market) equations, I assume that (i) assets and transfers (future, contemporaneous, or lagged) do not appear in either of them, (ii) lagged wages do not appear in the first equation, (iii) the discount rate γt process does not appear in the second equation. Finally, I assume that past discount rate shocks do not cause movements in current productivity at or wealth ηt, as is evidenced by the autoregressive matrix N.

4.1 A Counterfactual Economy without Fiscal Integration

Using US state-level data on employment, wages, assets, taxes and transfers, I construct a coun- terfactual US economy without a transfer policy rule in place (i.e., Θ1 = 0). For reasons of space and conciseness, I leave to Online Supplementary Material S.3 a detailed description of the data used as well as all the steps in constructing a Robust Transfer Policy Counterfactual: (i) the esti- 0 mation of the regional SVAR and the US transfer policy rule Θ , (ii) the inference of Γ0 given the estimated SVAR, and (iii) the identification of ξ¯ and construction of Γ(ξ¯, Θ1). As a primer, the left panel of Figure 1 shows a scatter plot of net federal transfers growth (direct federal transfers minus federal taxes growth) between 2006 and 2010 against nominal wage income growth (wage plus employment growth) between 2006 and 2010 in the United States. There is a very strong, negative relationship between the two.17 If the tax-and-transfer system helps stabilize regional economies, it is because a state whose economy worsens (relative to the average) receives some relief through federal transfer payments or lower taxes. The right panel of Figure 1 shows the responses to a one-standard-deviation discount rate shock γ, both using the actual estimated regional SVAR for the US and the counterfactual SVAR corresponding to a US economy without a transfer policy rule (Θ1 = 0). I find that employment and wages both decrease on impact, whereas assets increase in response to a discount rate shock. While these responses result from the restrictions imposed in the SVAR to identify the shocks (described in Online Supplementary Material S.3), it is reassuring that they agree with the theo- retical responses in typical small open economy models. As for the effects of fiscal integration, I find that amplification and persistence of discount rate shocks are mitigated by the transfer pol- icy rule—e.g., the employment response (after two years) is -1.2 percent in the actual economy, whereas it is -2.1 percent in the counterfactual economy without transfers. Table 1 presents moments of the employment distribution in the actual and counterfactual 2010 economies. The cross-state employment standard deviation in the US data in 2010 (σn ) was 2.6 percent (this corresponds to the last column in the left panel). I then consider the following

17This is due to both the progressivity of the tax system and automatic stabilizers like unemployment insurance.

12 1.5 .8

Slope=-1.25 (0.18) 1

0.5 .6

0

−0.5 .4

−1 Deviation from aggregate (%)

.2 −1.5 Employment Counterfactual Employment Wage −2

Net federal transfers log-growth (2006-2010) transfers log-growth Net federal Counterfactual Wage Assets

0 Counterfactual Assets −2.5 -.15 -.1 -.05 0 .05 .1 0 2 4 6 8 10 12 14 16 18 20 Years Nominal wage income log-growth (2006-2010)

Figure 1: Transfers policy rule and responses to γt shock. Left panel: Log-growth of net transfers in a state between 2006 and 2010 against nominal wage income log-growth during the same period. Circle sizes are state population size in 2006. The line is from an OLS population-weighted regression; Right panel: SVAR implied 0 1 responses to a 1SD γt shock in the actual economy with transfers (Θ ) and the counterfactual one without (Θ = 0). thought experiment. At the end of 2007, it is announced that from 2008 onwards the United States federal government would cease to give transfers according to Θ0 and instead would have the policy rule Θ1 = 0. How would regional economies have evolved if they had been hit by the same sequence of shocks? I find that the counterfactual standard deviation of employment in 2010 would have been 3.5 percent. To give some context to these numbers, aggregate output volatility during the pre-Volcker period (1960:1 to 1979:2) was 2.7, whereas during the post- Volcker-disinflation period (1982:4 to 1996:4) volatility was 2.06.18 Much literature examines the causes of this “Great Moderation.” The consequences of the US federal tax-and-transfer system are in the same order of magnitude. To summarize, these results imply that the federal tax-and- transfer system stabilized regional economies by redistributing resources from regions that were doing relatively well to regions that were doing relatively poorly during the Great Recession.

γ (γ, a)(γ, a, η) Γ(ξ¯, Θ) Sims-Zha 2010 0 0 σn Θ 2.3 2.5 2.6 σn Θ 3.5 3.5 Θ1 = 0 3 3.4 3.5 Θ1 = 0 4.9 4 0 0 σn Θ 2.3 2.7 3.5 sn(0) Θ 7.8 7.8 1 Θ = 0 3.9 4.5 4.9 p Θ1 = 0 10.7 8.4

2010 Table 1: Employment statistics: Fiscal integration v. Fiscal autonomy. Left panel: σn is the SD of 0 employment nt across states in 2010 in percentages. σn is the SD in the stationary distribution. Line Θ corresponds to the economy under the benchmark transfer policy rule, and line Θ1 = 0 corresponds to the counterfactual. Each column respectively feeds the shock γ alone, both γ and z, and all shocks together (γ, z, η). Right panel: sn(0) is the spectrum at zero frequency (the long-run variance of nt). Column Γ(ξ¯, Θ) corresponds to results when constructing a Robust Policy Counterfactual. Column “Sims-Zha” follows the methodology in Sims and Zha (2006).

In the lower half of the left panel, I also present Monte Carlo estimates of the standard de- viation (σn) of employment in the stationary distribution. I construct them by sampling with

18These numbers come from Clarida, Gali, and Gertler (2000).

13 replacement 1,000,000 observations from the empirical distribution of shocks, feeding them to the SVAR and calculating the corresponding statistic. Results are qualitatively similar to the ones during the Great Recession. Furthermore, in the first and second columns of the left panel, I calculate the same statistics if regional economies had been hit by only γ shocks or both γ and a shocks. Comparing the first and last columns, I find that most of the employment variation across states (in 2010 or in the stationary distribution) is accounted for by discount rate shocks.19 Moreover, fiscal integration reduced employment dispersion primarily by stabilizing such re- 2010 gional γ shocks. For instance, the total reduction in σn is 0.9 (3.5 minus 2.6) of which 0.7 (3 minus 2.3) is achieved because of stabilization of such γ shocks. In the right panel of Table 1, I compare the results when constructing a Robust Transfer Policy Counterfactual with those from using the methodology proposed by Sims and Zha (2006) instead.20 This is the leading alternative methodology in the literature to gauge the consequences of the endogenous components of policy in the context of SVARs. 21 The top half shows results for the stationary standard deviation of employment, whereas the bottom half shows the (square- rooted) long-run variance, constructed from the spectrum at frequency zero of the multivariate SVAR. I find that fiscal integration also helps stabilizing regional economies when I follow the Sims-Zha methodology. However, the counterfactual increase in employment volatility across states is much smaller in this case than when computed using the Robust Transfer Policy Coun- terfactuals’ methodology. For example, the stationary standard deviation increases by 1.4 (i.e., 4.9 minus 3.5) in the latter whereas it only increases by 0.5 (i.e., 4 minus 3.5) under the Sims-Zha methodology. These results imply that we would have erroneously concluded that fiscal integra- tion is much less relevant in stabilizing regional business cycles if we had used a methodology that is not immune to Lucas critique—i.e., if we did not take into account the change in agents’ behavior as they internalize the change in the transfer policy rule when making decisions and forming expectations.

4.2 What do we learn by using the Robust Policy Counterfactuals’ methodology?

By quantifying the regional stabilization consequences of fiscal integration, the first contribution of the exercise above is to inform policy-makers discussing these issues — for instance, in the context of talks about future fiscal and other risk-sharing arrangements for the European Mon- etary Union. Whereas the theory is well developed (e.g., Farhi and Werning (2014)), there is surprisingly little quantitative research on this question (see Evers (2015) for an exception). The second contribution is to inform models of international business cycles and risk-sharing

19This is in line with findings in Beraja, Hurst, and Ospina (2015), in which data, and particularly the identification strategy, are different. 20Specifically, to “zero-out” the transfer policy response, I construct transfer policy shocks (which are equivalent to η shocks in our class of fiscal union models) that exactly offset the endogenous response of the transfer policy rule. I then feed the SVAR with both the non-transfer policy shocks and these offsetting transfer policy shocks. 21Admittedly, the methodology is typically used to study such consequences in the short-run when it is more plausible that agents do not understand that the policy rule has changed and, thus, Lucas critique is less of a problem. However, here we are interested in the long-run consequences of fiscal integration.

14 arrangements more generally. While it may not be computationally hard to solve many fully- specified models and compute counterfactuals, the methodology in this paper seeks to boil them down to their core essence, thus allowing one to understand which assumptions are relevant for the question at hand and which ones less so. For the question of fiscal integration, we learn, for example, that different assumptions on the particular form of wage stickiness, extra discounting due to behavioral agents, or how hand-to-mouth households are introduced do not seem to be crucial. Many versions of such models are counterfactually equivalent since they are characterized by structures in the counterfactually equivalent set (Θ0, Γ0, Θ1, Γ(ξ¯, Θ1)). At the C same time, we learn that whether models include exogenous shifters of the Euler equation like

γt is indeed crucial. This is because, as shown in Table 1, I find that most of the reduction in employment volatility from fiscal integration comes from stabilizing these type of shocks. The third contribution of the methodology is to help us organize previous numerical results in the literature coming from disparate models. Because transfer rules are a form of risk-sharing arrangement, the quantitatively small reductions in volatility found by Evers (2015) are less sur- prising once we connect them to the literature studying the Backus-Kehoe-Kydland consumption correlation puzzle (see Backus, Kehoe, and Kydland (1992)). Even in leading quantitative models with very incomplete asset markets (e.g., when only a one-period bond is traded) and where many other frictions are present (e.g., nominal rigidities, habits), consumption is more highly correlated across countries than output whereas the opposite holds in the data. This implies that the consumption and employment volatility reductions from better risk-sharing arrangements are small in such models because the equilibrium is rather close to the complete-markets allo- cation. While these models are not, strictly speaking, counterfactually equivalent with respect to changes in risk-sharing policy rules, they are rather close in practice. In contrast, for the set of counterfactually equivalent models I study in Section 4, there are rather large reductions in the volatility of employment from fiscal integration due to the presence of exogenous shifters of the Euler equation—thus implying substantial reductions in the correlation of consumption and output since I assume that consumption is a non-tradable in these set of models. In fact, a recent paper by Itskhoki and Mukhin (2017), shows that these type of shocks can resolve several puzzles in international macroeconomics.

5 Conclusions

I have presented a novel method to construct counterfactuals with respect to policy rule changes which does not require fully specifying a structural model. It rests on an insight about widely used linear dynamic stochastic models, which I have called the Principle of Counterfactual Equiv- alence. The principle says that many models which are observationally equivalent under a bench- mark policy rule also generate an identical counterfactual equilibrium under an alternative one. To show how to apply the method in practice, I quantified how US fiscal integration contributes to regional stabilization by redistributing resources across states through a transfer policy rule.

15 While I have emphasized the benefits and feasibility of the methodology, it also has many limitations when compared to fully structural approaches. First, it is an inherently linear method- ology. Counterfactual questions that involve non-linearities are not easily accommodated in the current framework, although I conjecture that a modified version of the principle of counterfac- tual equivalence could be applied to models that are well approximated by higher order pertur- bations. Second, I have only discussed the positive implications of a counterfactual police rule change. Deriving normative implications will typically require restricting the set of models and welfare functions even further. Lastly, when constructing counterfactuals, the number of linear restrictions that need to be imposed on the structure grows proportionally with the number of variables and equations in the set of models studied. In certain applications, this may make the method impractical.22 Finally, three other uses of the methodology are worth mentioning. I leave them to future re- search. First, suppose that we conduct a counterfactual policy exercise in an off-the-shelf model and find negligible effects. How could we “backward-engineer” a new model that would change those conclusions? For instance, because we wish to know how robust the conclusions are or because we observe non-negligible effects empirically and we are interested in model extensions that would generate them. The method in this paper suggests an answer. We can try many different linear restrictions in an ad-hoc fashion and construct counterfactuals. We can then see which restrictions implied non-negligible effects and think about microfoundations for models that satisfy them. The second use is constructing an “average of counterfactuals” across different counterfactually equivalent sets by giving them different a-priori weights. The last use is finding a policy rule that induces some equilibrium properties across many different sets of counterfactu- ally equivalent models. For example, we could ask: what features of an interest rate rule achieve low output volatility across many sets of counterfactually equivalent New Keynesian models?

Appendix

A Proof of Propositions 1.A and 1.B

0 F∗ Proof of Proposition 1.A For any line n = 1, 2, .., k p , we know that n. from the struc- 0 { − } " Gn∗. # l ture of model j∗ is a solution to (Null CE), by construction. This implies that, given constants cn, 0 0 Fl. k p Fn∗. a linear combination ∑ − cl is also a solution. Replacing any such solution 0 n=1 n 0 " Gl. # ≡ " Gn∗. # H0 into (Null OE), we obtain the remaining elements l. in this proposed structure ξ. 0 0 " (LN + M)l. #

22In other applications though, it may in fact help because both the structures of larger models are more restricted and data on more variables imposes more restrictions on the set of models that could have generated it. For example, when we do not have data on certain variables and we replace them out from equilibrium equations, then the structure will typically satisfy less exclusion restrictions.

16 k p l This shows that, if every line in the structure ξ of a model can be written as (ξl)0 = ∑n−=1 cn(ξn∗)0, then ξ solves both (Null OE) and (Null CE) and, therefore, belongs to (Θ0, Γ0, Θ1, Γ1). C

Proof of Proposition 1.B We begin by defining A(Γ0, Γ1) in system (Null CE)

(P1)20 (P0)20 P10 P00 A(Γ0, Γ1) − − ≡ " (Q1N0 + P1Q1)0 (Q0N0 + P0Q0)0 Q10 Q00 # − − (k+s).2k

We have shown above that A(Γ0, Γ1) must have incomplete rank since the span of the k p 0 − F∗ columns n. are solutions to (Null CE). In particular, we know that it has at least k p 0 " Gn∗. # − linearly dependent columns and thus rank(A(Γ0, Γ1)) min rank(P1 P0) + s, 2k (k p) . ≤ { − − − } Therefore, if the total number of endogenous and exogenous variables rank(P1 P0) + s is strictly − less than k + p, then rank(A(Γ0, Γ1)) < 2k (k p). As a result, there must be other solutions − − 0 F∗ to (Null CE) than the ones in the span of the k p columns n. . Taking any such solution 0 − " Gn∗. # H0 and replacing it into (Null OE), we again obtain the remaining elements l. in a 0 0 " (LN + M)l. # structure that solves both (Null OE) and (Null CE). This shows that, if rank(P1 P0) + s < k + p, − then (Θ0, Γ0, Θ1, Γ1) includes more structures beyond the ones whose lines can be written as C k p l (ξl)0 = ∑n−=1 cn(ξn∗)0.

B Proof of Theorem 1

The proof is by construction. The linear system (Null OE) has 3k + s unknown elements of structure ξ¯ and there are k + s independent linear equations (Null OE is full rank). Then, consider ¯ 1 identifying the first line ξ1 in the structure. If we specify a set Rn1 of n1 = 2k independent linear restrictions, we obtain a linear system of equations that is exactly determined. That is,

F¯ (P0)2 P0 I 0 1.0 0 0 0 k k.k ¯ (k+s).1 0 0 0 0 0 G1.0 (Q N + P Q )0 Q 0 0s.k Is   = ...   H¯   1 10 1 R2k.(3k+s)  0  r    (LN¯ + M¯ )1.0   2k.1     (3k+s).1     Note however that one of the restrictions is always a normalization, so we only in fact need 2k 1 − restrictions. Then, repeating the above for every line l = 1, 2, .., k p in the structure of ξ¯, we { − } identify a unique structure ξ¯ that is both consistent with the imposed linear restrictions and the recursive representation Γ0 under the benchmark policy Θ0. Finally, in order to obtain the Robust Policy Counterfactual Γ(ξ¯, Θ1) under the alternative policy Θ1, we solve the usual non-linear system of matrix equations (SMERR), given ξ¯ and Θ1. This concludes the proof because the Robust Policy Counterfactual is unique and, therefore,

17 identical for all models j characterized by a structure ξ j in (Θ0, Γ0, Θ1, Γ(ξ¯, Θ1)). Necessary and C sufficient conditions for a structure to belong to such set are those laid out in Lemma 2 and Propositions 1.A and 1.B.

References

Alvarez, Fernando, Hervé Le Bihan, and Francesco Lippi. 2016. “The real effects of monetary shocks in sticky price models: a sufficient statistic approach.” The American Economic Review 106 (10):2817–2851.

Asdrubali, Pierfederico, Bent E Sorensen, and Oved Yosha. 1996. “Channels of interstate risk sharing: United States 1963-1990.” The Quarterly Journal of Economics :1081–1110.

Backus, David K, Patrick J Kehoe, and Finn E Kydland. 1992. “International real business cycles.” Journal of political Economy 100 (4):745–775.

Beraja, Martin, Erik Hurst, and Juan Ospina. 2015. “The Aggregate Implications of Regional Business Cycles.” Unpublished Manuscript, .

Bernanke, Ben S, Mark Gertler, and Mark Watson. 1997. “Systematic monetary policy and the effects of oil price shocks.” Brookings papers on economic activity 1997 (1):91–157.

Chetty, Raj. 2009. “Sufficient statistics for welfare analysis: A bridge between structural and reduced-form methods.” Annual Review of Economics 1 (1):451–488.

Clarida, Richard, Jordi Gali, and Mark Gertler. 2000. “Monetary Policy Rules and Macroeconomic Stability: Evidence and Some Theory.” Quarterly Journal of Economics :147–180.

Del Negro, Marco and Frank Schorfheide. 2009. “Monetary Policy Analysis with Potentially Misspecified Models.” The American Economic Review 99 (4):1415.

Evers, Michael P. 2015. “Fiscal Federalism and Monetary Unions: A Quantitative Assessment.” Journal of International Economics .

Farhi, Emmanuel and Iván Werning. 2014. “Fiscal unions.” Unpublished Manuscript, Massachusetts Institute of Technology .

Feyrer, James and Bruce Sacerdote. 2013. “How Much Would US Style Fiscal Integration Buffer European Unemployment and Income Shocks?(A Comparative Empirical Analysis).” The American Economic Review 103 (3):125–128.

Gabaix, Xavier. 2020. “A behavioral New Keynesian model.” American Economic Review 110 (8):2271–2327.

Heckman, James J and Edward J Vytlacil. 2007. “Econometric evaluation of social programs, part I: Causal models, structural models and econometric policy evaluation.” Handbook of econometrics 6:4779–4874.

Hurwicz, Leonid. 1962. “On the structural form of interdependent systems, Logic, Methodology and Philosophy of Science, edited by E. Nagel, P. Suppes & A. Tarski.”

18 Itskhoki, Oleg and Dmitry Mukhin. 2017. “Exchange rate disconnect in general equilibrium.” Tech. rep., National Bureau of Economic Research.

Koopmans, Tjalling Charles. 1950. Statistical inference in dynamic economic models. 10. Wiley.

Lucas, Robert E. 1976. “Econometric policy evaluation: A critique.” In Carnegie-Rochester confer- ence series on public policy, vol. 1. Elsevier, 19–46.

Marschak, Jacob. 1974. “Economic measurements for policy and prediction.” In Economic Infor- mation, Decision, and Prediction. Springer, 293–322.

McKay, Alisdair, Emi Nakamura, and Jón Steinsson. 2017. “The discounted euler equation: A note.” Economica 84 (336):820–831.

Nakamura, Emi and Jón Steinsson. 2014. “Fiscal Stimulus in a Monetary Union: Evidence from US Regions.” The American Economic Review 104 (3):753–792.

Sala-i Martin, Xavier and Jeffrey D Sachs. 1991. “Fiscal Federalism and Optimum Currency Areas: Evidence for Europe from the United States.” NBER Working Paper (w3855).

Schmitt-Grohé, Stephanie and Martın Uribe. 2003. “Closing small open economy models.” Journal of international Economics 61 (1):163–185.

Sims, Christopher A. 1980. “Macroeconomics and reality.” Econometrica: Journal of the Econometric Society :1–48.

Sims, Christopher A and Tao Zha. 2006. “Does monetary policy generate recessions?” Macroeco- nomic Dynamics 10 (2):231–272.

Uhlig, Harald. 1995. “A toolkit for analyzing nonlinear dynamic stochastic models easily.” .

19 SUPPLEMENTARY MATERIAL (NOT FOR PUBLICATION)

S.1 S.1 Baseline Model of a Small Open Economy in a Fiscal and Mone- tary Union in Section 2

Consider an economy comprised of many islands, inhabited by a representative household and firm. The only other agent in the economy is a federal government. Households consume, work, and save/borrow in a non-state-contingent asset—a nominal bond in zero net supply. Firms produce final consumption goods using labor and intermediate goods. By assumption, the final consumption good is non-tradable, intermediate goods are tradable, and labor is not mobile across islands. Finally, each island has an exogenous endowment of intermediate goods. The federal government sets the nominal interest rate on the nominal bond, and gives lump-sum transfers to the islands. Assume that the nominal interest rate follows an endogenous rule that is a function of only aggregate variables (together with a fixed nominal exchange rate, this implies that the islands are part of a monetary union). Also, assume that federal transfers are a function of island-level variables alone. Throughout, I assume that parameters governing preferences and production are identical across islands and the islands only differ, potentially, in the shocks that hit them—these shocks include a shifter of the households discount rate, a productivity shifter in the production function of final goods, and the exogenous endowment of tradable intermediate goods. Finally, I assume that all labor, goods and asset markets are competitive.

y Firms and Households. Final goods producers use labor Nkt and intermediates Xkt in island k at time t and face prices Pkt, wages Wkt, and intermediate prices Qt (equalized across all islands because of assumed tradability). Their profits are

akt y α 1 α y max Pkte (N ) (Xkt) − Wkt N QtXkt y kt − kt − Nkt,Xkt where akt is a productivity shock and α : α < 1 is the labor share. Unlike the tradable goods prices, final good prices (Pkt) vary across islands. Households preferences are given by

∞ 1 σ 1+ν t ρ δ (Ckt) − ν ν E β e− kt− kt N 0 ∑ 1 σ − 1 + ν kt "t=0  − # where Ckt is consumption of the final good, Nkt is labor, δkt is an exogenous processes driving the household’s discount rate. Moreover, I follow Schmitt-Grohé and Uribe (2003) and let ρkt be the endogenous component of the discount factor that satisfies ρkt+1 = ρkt + Φ(.) for some function Φ(.) of the average per capita variables in an island. As such, agents do not internalize this de- pendence when making their choices. This modification induces stationarity for an appropriately chosen function Φ(.) when assets markets are incomplete (as we assume below). Households are able to spend their labor income Wkt Nkt plus profits accruing from firms ηkt Πkt and exogenous endowment of tradable goods Qte , financial income Bkt 1it 1 and transfers − − from the government τkt, where Bkt 1 are nominal bond holdings at the beginning of the period − and it is the nominal interest (equalized across islands given our assumption of a monetary union where the bonds are freely traded) on consumption goods (Ckt) and savings (Bkt Bkt 1). Thus, − − they face the period-by-period budget constraint

ηkt PktCkt + Bkt Bkt 1(1 + it 1) + Wkt Nkt + Πkt + τkt + Qtη¯e ≤ − −

S.2 Federal government. The federal government budget constraint is

g g B + τkt + QtG = B (1 + it 1) t ∑ t 1 − k − where G is some exogenous level of government spending in intermediate goods. The key feature of a fiscally integrated economy is that the federal government has the ability to redistribute resources across islands via transfers τkt. If the islands where fiscally independent such transfers would not be possible. I assume that the federal government announces a nominal interest rate rule it = i(.) as a function of aggregate variables in the economy alone. Moreover, it announces a transfer policy rule as a function of per-capita employment, wages and assets in an island

ϑw ϑn ϑb τkt = τ¯(W˜ kt) (N˜ kt) (B˜kt 1) − Again, agents do not internalize this dependence when making their choices.

Exogenous shocks and processes. I assume the exogenous processes are AR(1) processes, with an identical autoregressive coefficient across islands, and that the innovations are iid, mean zero, random variables with an aggregate and island specific component. First, define γkt δkt δkt 1. ≡ − − Then,

a a akt = ρaakt 1 + σ˜avt + σaekt − γ γ γkt = ργγkt 1 + σ˜γvt + σγe − kt η η ηkt = ρηηkt 1 + σ˜ηvt + σηe − kt z γ η with ∑k ekt = ∑k ekt = ∑k ekt = 0. By assumption, I assume the average of the regional shocks sum to zero in all periods. The "discount rate" process γkt is a shifter of a household’s discount rate, but it can be viewed as a proxy for the tightening of household borrowing limits. The "productivity" process akt can be interpreted as actual productivity, or a shifter of firm’s demand for labor or firm’s mark- ups. Finally, "wealth" process ηkt is modeled as an endowment of intermediate goods but can be interpreted as shifters of the budget constraint that agents face such as exogenous changes in household wealth.

Equilibrium. An equilibrium is a collection of prices P , W , Q and quantities C , N , B , τ , Ny , X { kt kt t} { kt kt kt kt kt kt} for each island k and time t such that, for an interest rate rule it = i(.) and given exogenous processes a , η , γ , they are consistent with household utility maximization and firm profit { kt kt kt} maximization and such that the following market clearing conditions hold:

akt y α β Ckt = e (Nkt) (Xkt) y Nkt = Nkt ηkt G + ∑ Xkt = ∑ η¯e k k g 0 = ∑ Bkt + Bt k

S.3 Aggregation. The first important assumption for aggregation is that all islands are identical with respect to their underlying production and utility parameters.1 The second assumption is that the joint distribution of island-specific shocks is such that its cross-sectional summation is zero. If K, the number of islands, is large this holds in the limit because of the law of large numbers. I log-linearize the model around this steady state and show that it aggregates up to a representative economy where all aggregate variables are independent of any cross-sectional considerations to a first order approximation.2 I denote with lowercase letters an island variable’s log-deviation from the aggregate union equilibrium. Lowercase letters with a tilde denote devi- ations from the steady state. For example, n n˜ n˜ and n˜ ∑ 1 n˜ = ∑ 1 log (N /N¯ ). kt ≡ kt − t t ≡ k K kt k K kt I assume that the monetary authority announces the nominal interest rate rule in log-linearized form: i˜t+1 = ϕπEt[π˜ t+1] where π˜ t is the aggregate inflation rate. Finally, I assume that the endogenous component of the discount factor is such that Φ(.) = φnkt. The following lemma present the aggregation result and shows that we can write the island level equilibrium in deviations from these aggregates.

Lemma S.1. For given a , γ , η , the behavior of w , n , b , τ , p , c , x in the log-linearized { kt kt kt} { kt kt kt kt kt kt kt} economy for each island in log-deviations from aggregates is identical to that of a small open economy where the price of intermediates and the nominal interest rate are at their steady state levels, i.e. q˜ = i˜ = 0 t. t t ∀ Proof. The following equations characterize the log-linearized equilibrium

1 w˜ p˜ = n˜ + σc kt − kt ν kt kt w˜ p˜ = (α 1)(n˜ x˜ ) + a˜ kt − kt − kt − kt kt q˜ p˜ = α(n˜ x˜ ) + a˜ t − kt kt − kt kt 0 = E (mu˜ mu˜ ) + (p˜ p˜ ) + φ(n˜ n˜ ) + γ i˜ t − kt+1 − kt+1 kt+1 − kt kt − t kt+1 − t mu˜ = σc˜ kt − kt  c˜ = w˜ p˜ + n˜ kt kt − kt kt ˜ ˜ ˜ B¯bkt = B¯(1 + r)(bkt 1 + it) + ηη¯ kt η¯(q˜t + x˜kt) + τ¯τ˜kt − − ∑ x˜kt = ∑ η˜kt k k ¯ g ˜ g ¯ ¯ g ˜ g ˜ B bt + τ¯ ∑ τ˜kt + Gq˜t = B (1 + r)(bt 1 + it) k − ˜ τ˜kt = ϑww˜ kt + ϑnn˜ kt + ϑbbkt 1 − i˜ = φ E [p˜ p˜ ] t+1 p t t+1 − t After adding up, the aggregate log-linearized equilibrium evolution of w˜ p˜ , n˜ is character- { t − t t} 1Given that the broad industrial composition at the state level does not differ much across states, the assumption that productivity parameters are roughly similar across states is not dramatically at odds with the data. 2The model we presented has many islands subject to idiosyncratic shocks that cannot be fully hedged because asset markets are incomplete. By log-linearizing the equilibrium we gain in tractability, but ignore these considerations and the aggregate consequences of heterogeneity. As usual, the approximation will be a good one as long as the underlying volatility of the idiosyncratic shocks is not too large. If our unit of study was an individual, as for example in the precautionary savings literature with incomplete markets, the use of linear approximations would likely not be appropriate. However, since our unit of study is an island the size of a state I believe this is not too egregious of an assumption. The volatilities of key economic variables of interest at the state level are orders of magnitude smaller than the corresponding variables at the individual level.

S.4 ized by

0 = E ( (mu˜ mu˜ ) + (1 φ )(p˜ p˜ ) + γ˜ ) t − t+1 − t − p t+1 − t t+1 1 0 = σ(w˜ p˜ + n˜ ) + n˜ (w˜ p˜ ) t − t t ν t − t − t w˜ p˜ = (α 1)n˜ + a˜ + (1 α)η˜ t − t − t t − t mu˜ σ(w˜ p˜ + n˜ ) t ≡ − t − t t which is equivalent to the system of equations characterizing the log-linearized equilibrium in a representative agent economy with a production technology that utilizes labor alone with an elasticity of α, no endogenous discounting and only 2 exogenous processes a˜ + (1 α)η˜ , γ˜ . { t − t t} Next, take log-deviations from the aggregate in the original system and replace ckt, pkt, mukt for their corresponding expressions. When we set ργ = ρa = ρη = 0 and θw = θb = 0, this results in the system displayed in Section 2 characterizing the equilibrium of n , w , b , τ (where { kt kt kt kt} we drop the ’k’ index for convenience).

1 1 ρ 0 = E (n n ) + α + (1 α) E (w w ) + ( 1)a + n (Euler) t t+1 − t σ − t t+1 − t σ − t σ t   (1 + ν) 0 = αw + 1 n a (Labor market) − t (1 σ) − t − t  −  B¯ B¯ η¯ 0 = bt + (1 + r) bt 1 + (ηt (wt + nt)) + τt (Budget Constraint) − τ¯ τ¯ − τ¯ − 0 = τt + θnnt (Policy) − η 0 = a + ea ; 0 = η + e (Shocks) − t+1 t+1 − t+1 t+1 This system is independent of all aggregate variables and is analogous to the system character- izing the equilibrium in a small open economy without movements in the terms of trade and nominal interest rate.

S.2 Examples of Observational and Counterfactual Equivalence in Sec- tion 2

Gabaix (2020) presents a "discounted Euler equation" that arises when households have behav- ioral biases (e.g., inattention to macroeconomic variables or cognitive discounting). McKay, Naka- mura, and Steinsson (2017) offers an alternative micro-foundation in a model with idiosyncratic income risk and borrowing constraints. In both cases, the Euler equation takes the form

1 1 0 = δE (c˜ ) c˜ + E (p˜ p˜ ) i˜ . t kt+1 − kt σ t kt+1 − kt − σ t

When writing in terms of deviations from the aggregate union equilibrium and replacing ckt, pkt (and dropping the ’k’ subscript for convenience), we obtain

1 1 1 0 = δE (n ) n + αδ + (1 α) E (w ) α + (1 α) w + ( 1)a . (Euler 1) t t+1 − t σ − t t+1 − σ − t σ − t     Alternatively, Schmitt-Grohé and Uribe (2003) present a model with quadratic portfolio adjust-

S.5 ment costs. In such case, the Euler equation becomes

1 1 ψB¯ 0 = E (c˜ ) c˜ + E (p˜ p˜ ) i˜ + b˜ . t kt+1 − kt σ t kt+1 − kt − σ t σ kt And, therefore, we obtain,

1 1 ψB¯ 0 = E (n ) n + α + (1 α) E (w w ) + ( 1)a + b . (Euler 2) t t+1 − t σ − t t+1 − t σ − t σ t   S.2.1 Observational Equivalence Next, we want to show the models above are observationally equivalent to our baseline model from the previous section. Consider a parameterization of such baseline σ∗, α∗, ν∗, φ∗, B¯, τ¯, η¯ 0 0 which gives rise to structure ξ∗ and generates recursive representation Γ under policy θn. Then, by construction, we have that the first line in the structure corresponding to the Euler equation

(ξ1∗) satisfies (Null OE). Since the only state variable is bt, this is:

ξ1∗ 0 20 00 (P ) P Ik 0k.s 1   0 0 0 0 0 00 1 (Q N + P Q ) Q 0s.k Is z α∗ + }|(1 α∗) {    σ∗ −    0 0 0 0 0 0000100000  0    z 0 0 0 0 0000010000}| {  φ∗   0 0 0 0 0 2 0 0 0 0 0 0   1 + σ  p13 p33 p23 p33 (p33) p43 p33 p13 p23 p33 p43 0 0 1 0 0 0  − 1 ∗   α∗ + (1 α∗)  =  0 0 0 0 0000000100   − σ∗ −   0 0 0 0 0 0 0 0 0 0 0 0   0   q31 p13 q31 p23 q31 p33 q31 p43 q11 q21 q31 q41 0 0 0 0 1 0      0 0 0 0 0 0 0 0 0 0 0 0   0   q32 p13 q32 p23 q32 p33 q32 p43 q12 q22 q32 q42 0 0 0 0 0 1       04.1       1 1   σ∗ −   0    0   0  0 0 φ∗ 0 0 1  p (p 1 + ) + p (p 1) α∗ + (1 α∗) 13 33 − σ∗ 23 33 − σ∗ − = 0  0      0 0 φ∗ 0 0 0 0 1 1  q p + ( 1 + )q + (q p q ) α∗ + (1 α∗) + 1  31 13 σ∗ 11 31 23 21 σ∗ σ∗   0 0 − φ∗ 0 0− 0 0 −1 −   q p + ( 1 + )q + (q p q ) α∗ + (1 α∗)   32 13 σ∗ 12 32 23 22 σ∗    − − −   Analogously, for the second line ξ2∗, we have that (Null OE) is,

0 0   0 1+ν∗ 0 p13 1 σ 1 p23α∗  − ∗ − −    0   = 0    0 1+ν∗ 0   q11 1 σ 1 q21α∗ 1   − ∗ − − −   0 1+ν∗  0   q12 1 σ 1 q22α∗   − ∗ − −     

S.6 Model ’1’ is observationally equivalent to the baseline. Consider a parametrization σ1, α1, ν1, δ1, B¯, τ¯, η¯ for model ’1’ with a "discounted Euler equation." First, notice that budget constraint is identical to the baseline model. This directly implies that (Null OE) is satisfied for the third line in the 1 1 1 1+ν 1+ν∗ model’s structure ξ . Second, guess that α = α∗ and 1 = . This directly implies that 3 1 σ 1 σ∗ − − 1 (Null OE) is satisfied for the second line in the model’s structure ξ2 as well. Then, to show that 1 1 1 model ’1’ is observationally equivalent to the baseline, we just need to find σ , δ such that ξ1 satisfies (Null OE). That is,

0 0  0 0 1 0 0 1 1 1 1 0 1 1 1  p13 p33δ 1 + p23 p33 α δ + σ1 (1 α ) p23 α + σ1 (1 α )  − − − −  = 0  0   0 0 1 0 0 0 1 1 1 1  0 1 1 1 1   q31 p13δ q11 + q31 p23 α δ + σ1 (1 α ) q21 α + σ1 (1 α ) + σ1 1   0 −0 1 0 0 0 1 1 −1 −1 0 1 −1 1 −   q p δ q + q p α δ + 1 (1 α ) q α + 1 (1 α )   32 13 − 12 32 23 σ − − 22 σ −      1 1 1 σ∗ Guessing that δ = σ 1 φ and σ = 1 , and replacing above shows that the system is satisfied. 1+ ∗− ∗ δ 1+ν∗ σ∗ To conclude, we have shown that whenever the lines in ξ∗ satisfy (Null OE) — which was by construction since Γ0 was generated by the baseline model — then there is a parameterization of model ’1’ such that the lines in ξ1 satisfy it as well.

Model ’2’ is observationally equivalent to the baseline. Consider a parametrization σ2, α2, ν2, ψ2, B¯, τ¯, η¯ 2 2 for model ’2’ with "portfolio adjustment costs." Again, both ξ3 and ξ2 satisfy (Null OE) when 2 2 1+ν 1+ν∗ 2 2 2 guessing that α = α∗ and 2 = . Thus, we just need to find σ , ψ such that ξ satisfies 1 σ 1 σ∗ 1 (Null OE) as well. That is, − −

0 0  2  0 0 0 0 2 1 ψ B¯ 0 p (p 1) + p (p 1) α + 2 (1 α∗) + 2 p  13 33 − 23 33 − σ − σ 33   0    2   0 0 0 0 0 0 2 1 2 1 ψ B¯ 0  q31 p13 q11 + (q31 p23 q21) α + σ2 (1 α ) + σ 1 + σ2 q31  − − − ∗ − 2   0 0 0 0 0 0 2 1 2 ψ B¯ 0   q p q + (q p q ) α + 2 (1 α ) + 2 q   32 13 − 12 32 23 − 22 σ − σ 32    2 σ∗ 1 2 0  Guessing that σ = σ∗ + φ∗ − and ψ B¯ = φ∗ p , and replacing above shows that the system 1+ν∗ 13 is satisfied. To conclude, we have shown that whenever the lines in ξ∗ satisfy (Null OE), then there is a parameterization of model ’2’ such that the lines in ξ2 satisfy it as well.

Model ’1’ is counterfactually equivalent to the baseline. For the parameterization above, we 1 1 have that ξ2 = ξ2∗ and ξ3 = ξ3∗. We next show that, for such parameterization, there are constants 1 1 1 1 1 1 c1, c2 such that ξ1 = c1ξ1∗ + c2ξ2∗ so that ξ1 is a solution to (Null CE). In particular, guessing that

S.7 c1 = δ1 and c1 = 1 δ1, we have that 1 2 − 1 0 1 α∗ + (1 α∗) 0  σ∗ −    0 0  0       0   1 + φ∗   1+ν∗ 1   σ   1 σ∗  1 1 1  − 1 ∗  1  − −  δ ξ1∗ + (1 δ )ξ2∗ = δ  α∗ + (1 α∗)  + (1 δ )  α∗  −  − σ∗ −  −  −   0   0        0   0       04.1   04.1       1 1   1   σ∗ −   −   0   0       1    δ 1 1 δ 1 δ 1 1 1 1 δ α∗ + σ (1 α∗) δ α + (1 α )  ∗ −   σ1 −  0 0    0   0  φ∗ 1+ν    1 + δ1 + (1 δ1) ∗   1   − σ∗ − 1 σ∗     δ1 −   1 −1 1  1 =  α∗ + (1 α∗)  =  α + σ1 (1 α )  = ξ1  − σ∗ −   − −   0   0            0   0       04.1   04.1     1  1 1  δ 1   σ1   σ∗   −   −   0   0        where the last line follows from the fact that, for model ’1’ to be observationally equivalent 1 1 1 σ∗ 1 to the baseline, we have required δ = σ 1 φ and σ = 1 and α = α∗. Therefore, the 1+ ∗− ∗ δ 1+ν∗ σ∗ above shows that for, this same parameterization, model ’1’ is not only observationally but also counterfactually equivalent to the baseline.

ψ2 Model ’2’ is not counterfactually equivalent to the baseline. The coefficient σ2 in (Euler 2) 2 associated with bt makes it such that, in order to write ξ1 as a linear combination of the lines in ξ∗, we would have to combine both the (Euler) and (Budget Constraint) equations (ξ1∗ and ξ3∗) because bt does not show up in the (Labor Market) equation (ξ2∗). However, when doing so, such linear combination would also have non-zero coefficients associated with bt 1, τt and ηt, whereas − 2 (Euler 2) restricts those coefficients to be zero. This implies that it is not possible to write ξ1 as a linear combination of the lines in ξ∗. Is it still the case there is some parameterization of model ’2’ that makes it counterfactually equivalent to the baseline, even when it does not satisfy the above sufficient condition? The answer is no. To see this, note that the parameterization necessary so that model ’2’ is obser- 2 ¯ 0 0 1 vationally equivalent to the baseline has ψ B = φ∗ p13. When policy changes from Θ to Θ , so 0 1 does p13 change to p13. Thus, model ’2’ cannot be parameterized to be observationally equivalent under both Θ0 and Θ1, and, by definition, is not counterfactually equivalent to the baseline.

S.8 S.3 Robust Transfer Policy Counterfactual in Fiscal Union Models

S.3.1 Data description I exclude Alaska, District of Columbia, and Hawaii from the analysis, leaving 48 observations (one for each remaining state) per year, and 6 years (2006-2011) of data. To make state-level nominal wages indices, I use data from the 2000 US Census and the 2001-2012 American Community Surveys (ACS).3 The 2000 Census includes 5 percent of the US population. The 2001-2012 ACS’s include approximately 600,000 respondents between 2001-2004, and about 2 million after 2004. The large sample sizes allow detailed labor market information at the state level. I begin by using the data to make individual hourly nominal wages. I restrict the sample to only individuals who are employed, who report usually working at least 30 hours per week, and who worked at least 48 weeks during the prior 12 months. For each individual, I divide total labor income earned during the prior 12 months by a measure of annual hours worked during the prior 12 months.4 The composition of workers differs across states and within a state over time, which might explain some variation in nominal wages across states over time. To account for this, I run the following regression: ln(witk) = Kt + ΓtXitk + uitk, where ln(wkt) is log-nominal wages for household i in period t residing in state k, and Xkt is a vector of household specific controls. The vector of controls include a series of dummy variables for usual hours worked (30-39, 50-59, and 60+), a series of five-year age dummies (with 40- 44 being the omitted group), 4 educational attainment dummies (with some college being the omitted group), three citizenship dummies (with native born being the omitted group), and a series of race dummies (with white being the omitted group). I run these regressions separately for each year such that both constant Kt and the vector of coefficients on the controls, Γt, can differ for each year. I then take the residuals from these regressions for each individual, uitk, and add back constant Kt. Adding back the constant from the regression preserves differences over time in average log wages. To compute average log wages within a state, holding composition fixed, I average uitk + Kt across all individuals in state k. I refer to this measure as the demographically adjusted, log-nominal wage in time t in state k. The measure of employment at the state level is the employment rate for each state, calculated using data from the US Bureau of Labor Statistics. The BLS reports annual employment counts and population numbers for each state and year. I divide employment counts by population to make an annual employment rate measure for each state. Data on federal transfers net of taxes paid come from the Bureau of Economic Analysis.5 Transfers include retirement and disability insurance benefits, medical benefits, income mainte- nance benefits, unemployment insurance compensation, veterans benefits, federal education and training assistance, and other transfer receipts of individuals from governments. Federal taxes are the sum of personal income taxes that are withheld, usually by employers, from wages and

3I access the data through the IPUMS-USA website https://usa.ipums.org/usa/. See Ruggles, Sobek, Fitch, Hall, and Ronnander (1997). 4Total labor income during the prior 12 months is the sum of both wage and salary earnings and business earnings. Total hours worked during the previous 12 months is a multiple of total weeks worked during the prior 12 months and the respondents’ reports of their usual hours worked per week. For some years, bracketed reports are provided for weeks worked during the prior 12 months, and the usual hours per week worked. In those cases, I take the midpoint of the brackets. 5I access the data through the BEA website on regional GDP and personal income: http://www.bea.gov/iTable/index_regional.cfm

S.9 salaries, quarterly payments of estimated taxes on income that is usually not subject to withhold- ing, and final settlements, which are additional tax payments made when tax returns for a year are filed, or as a result of audits by the Federal Government.6 Given the unavailability of official state-level data on asset positions, I construct a measure of state-level assets as the sum of physical and financial assets. From national account identities, we can derive the law of motion for assets Bt in a given state as:

local Bt = Bt 1(1 + rt) + Yt Ct + τt Gt + vt, − − − where Yt is nominal gross domestic product, Ct is private consumption expenditures, τt are net local transfers (i.e., expenditures minus taxes) from the federal government, Gt are expenditures from the local government, and rt 1 captures the change in asset valuation between t 1 and t. − − Finally, error term vt includes income receipts from abroad minus income payments to foreigners, federal government expenditures not counted as federal transfers (e.g., salaries and wages), and differences in returns between physical and financial assets for which no data are available.7 I obtain Y and C directly from the Bureau of Economic Analysis website. τ Glocal also comes t t t − t from several variables in the BEA. I calculate it as (personal current transfers receipts) - (personal current taxes paid + taxes on production and imports net of subsidies).8 The revaluation of assets term rt is obtained residually to ensure that the growth rate of the sum of local assets across states is consistent with the growth rate of aggregate net worth in the US economy. Having all components in the law of motion for Bt, I calculate assets at each point in time for each state by iterating forward with 2006 as the initial observation. I obtain initial assets in 2006 by aggregating at the state level, the zip code total net worth data from Mian, Rao, Sufi et al. (2013). In order to construct financial assets at the zip code level Mian, Rao, Sufi et al. (2013) they use data on dividends and interest income from the IRS Statistics of Income (SOI). They assume that households hold identical shares of stocks and bonds (they hold the market index portfolio). Given the share of total dividends and interest income received by a zip code they can construct the share of total stocks and bonds held by that zip code. Then, they total financial assets from the Federal Reserve’s Flow of Funds data to zip codes based on these shares. For the value of nominal debt owed by households they use data based on information from Equifax Predictive Services. Then they match the Federal Reserve Flow of Funds data by using the share of Equifax total debt in a zip code to allocate Flow of Funds debt. The final component of the asset measure is the value of housing wealth which they estimate using the 2000 Decennial Census data. They construct total home value as of 2000 in a zip code as the product of the number of home owners and the median home value. Then, they project it forward into later years using the CoreLogic zip code level house price index and an aggregate estimate of the change in homeownership and population growth.

S.3.2 Estimating the transfer policy rule Θ The transfer policy rule is: τt = ϑnnt + ϑwwt + ϑbbt 1 + et − 6Excise, Medicare and Social security federal taxes are not included in this measure. 7 Error term vt accounts for most of the "wealth" exogenous process. The remainder is the error term et in the difference between observed net transfers and estimated policy rule in equation Section S.3.2. 8As long as local government expenditures plus transfers are close enough to local tax revenues (i.e., local gov- ernments have a nearly balanced budget), the calculation is accurate. If not, the difference is absorbed by error term et.

S.10 For regional data to be used to estimate Θ, one of the following must hold: (1) the innovations to the policy rule have no regional component (et = 0)—in which case, a simple OLS regression produces consistent estimates—or (2) valid instruments can be found that isolate movements in nt, wt, bt 1 that are orthogonal to et. The issue of endogeneity arises because of reverse causality. − η When the innovation in the policy rule is part of the "wealth" shock et , employment and wages both cause and are caused by net transfers in the equation above. To deal with the endogeneity of nt, wt, bt 1, I proceed variously. First, I estimate a regression of net transfers onto nominal − wage income alone (assuming θw = θn) using house price growth between 2006 and 2010 as an instrument. This accords with many recent papers, including Mian and Sufi (2014). Contempo- raneous housing price growth strongly predicts contemporaneous nominal wage income growth. The instrument is valid as long as local housing prices are orthogonal to the transfer policy rule shock, which appears plausible. In the second approach, I use "discount rate" and "productivity" shocks in 2008, estimated from (FiscalSVAR), as instruments for wages and employment. They are linear combinations of wages, employment, and assets in 2008 that are orthogonal to the "wealth" shock, and hence et, by construction. Table 3 presents results for several specifications. The dependent variable is the log-growth rate of transfers minus the growth rate of taxes between 2006 and 2010 for each state. The independent variables are the log-growth rate of nominal wages between 2006 and 2010 and the log-growth rate of employment between 2006 and 2010 in the first two columns, and the log-growth rate of assets between 2006 and 2009 in the third column. In the fourth column, the independent variable is the sum of wage and employment growth. The first line is a simple OLS regression. The second presents two-stage, least-squares results using the "discount rate" and γ a "productivity" shocks in 2008 e2008, e2008. The third uses house price log-growth between 2006 and 2008 as an instrument instead. The fourth uses all three instruments. For all specifications, and when possible, I consider case (1) when bt 1 is not endogenous, and case (2) when bt 1 might − − be endogenous. I find that the policy rule estimates have the expected sign and are significant in all specifica- tions. They are also similar in magnitude, ranging from -1.3 to -1.6 for ϑn and -0.9 to -1.4 for ϑw. Lagged assets have nearly no independent explanatory power for net transfers across all speci- fications. To give a sense of the magnitudes involved, when net transfers increase by 30 percent for every 1 percent decrease in nominal wage income, and the average income tax rate is 0.17, for every 1 dollar decrease in nominal wage income, a state receives 0.22 dollars in federal transfers. This result is similar to findings by Feyrer and Sacerdote (2013), who find a 0.25 decrease, and Bayoumi and Masson (1995), who find a 0.31 decrease.

S.3.3 SVAR identification A necessary input in the construction of Robust Policy Counterfactuals is the impulse response matrix Q(ξ, N, Θ). The literature proposes myriad ways to identify it, ranging from simple or- dering assumptions to more sophisticated sign and long-run restrictions. These represent several routes that could be followed as long as their implied linear restrictions on the structure ξ¯ are consistent with the linear restrictions R , r used for constructing the counterfactual. Alter- { nl nl } natively, in this section I show how to use those equilibrium equations of structural models that we feel more confident about in order to derive linear restrictions on the structure ξ that are sufficient to identify Q. Specifically, I use the sequential budget constraint to generate these the- oretical restrictions because many fiscal union models are consistent with it. These theoretical restrictions imply a series of particular linear restrictions linking the reduced form errors to the structural shocks. Hence, this identification scheme fits nicely with the philosophy in this paper

S.11 Table 3: Policy rule baseline estimates

2 ϑn ϑw ϑb ϑw+n R 1.6 0.9 0.03 OLS − ∗∗ − ∗ − . 0.41 (0.5) (0.7) (0.02)

1.3 1.4 0.02 IV w/ shocks (1) − ∗ − ∗ − . 0.41 (0.7) (0.8) (0.02)

0.03 1.1 IV w/ house prices (1) .. − − ∗∗ 0.42 (0.02) (0.4)

1.4 1.2 0.02 IV w/ house prices and shocks (1) − ∗ − ∗ − . 0.43 (0.6) (0.6) (0.02)

1.3 1.4 0.01 (2) − ∗ − ∗ . 0.38 (0.7) (0.7) (0.08)

Note: Numbers in parenthesis are OLS (or second stage) standard errors. Variables with ’∗’ are significant at a 5% level. Variables with ’∗∗’ are significant at 1%. All variables are state log-growth rates between 2006 and 2010. bt 1 is exogenous in (1) and endogenous in (2) − and makes it easy to verify that the restrictions in R , r are not violated. Beraja, Hurst, and { nl nl } Ospina (2015) shows another application of this scheme. Baumeister and Hamilton (2015) present a scheme that is very similar in spirit. Following Ravenna (2007), if Assumptions 1 and 2 hold, and there is a matrix B such that BQ(ξ, N, Θ) is a non-singular square matrix, then there is a SVAR representation of the solution (RR) of the form:

xt = ρ1(ξ, N, Θ)xt 1 + ρ2(ξ, N, Θ)xt 2 + Q(ξ, N, Θ)et (SVAR) − − where ρ (ξ, N, Θ) P(ξ, Θ) + Q(ξ, N, Θ)N(BQ(ξ, N, Θ)) 1B; ρ (ξ, N, Θ) (P(ξ, Θ) ρ (ξ, N, Θ))P(ξ, Θ) 1 ≡ − 2 ≡ − 1 and V(ξ, N, Θ) Var(Q(ξ, N, Θ)et) = Q(ξ, N, Θ)ΣΣ0Q(ξ, N, Θ)0. ≡ 1 To see this, note that we can write zt 1 = (BQ(ξ, N, Θ))− (Bxt 1 BP(ξ, Θ)xt 2) and replace − − − − it and the law of motion for the exogenous states into the law of motion for the endogenous variables to obtain the SVAR(2) representation. Without loss of generality, I normalize the covariance matrix of structural shocks Σ to the identity matrix in what follows. The first step in the procedure consists of estimating the reduced form VAR to obtain the autoregressive matrices ρ , ρ , and the reduced form errors covariance { 1 2} matrix V. The second step is deriving identification restrictions that will allow us to infer Q and the shocks. 1 0 0 0 In what follows, I will assume that B = 0 1 0 0 . That is, a matrix that selects the   0 0 1 0 impulse responses in Q associated with nt, wt, bt. Replacing out the policy rule τt, and applying the conditional expectation operator Et 1(.) on both sides of the third line in the structure (i.e., −

S.12 the (Budget constraint)) and constructing the reduced form expectational errors, we obtain:

γ et a η 0 = g31 + ϑn g32 + ϑw g33 BQ et + m33et (Id1)  η  e   t   γ This equation must hold for all realizations of the shocks. Whenever there is an innovation to et a η or et and et = 0, employment, wages, and debt must co-move on impact in a way that satisfies this linear relationship. Hence, it gives us two linear restrictions in the second and third columns’ γ a elements of BQ when there are either contemporaneous et or et shocks. Similarly, constructing Et 1(.) Et 2(.), we obtain: − − − γ et 1 a− 0 = g31 + ϑn g32 + ϑw g33 Bρ1B0 + 0 0 h33 + ϑb BQ et 1  η−  et 1     − η a   + m33n33et 1 + m33n32et 1 (Id2) − − This gives us one extra linear restriction in the second column’s elements of BQ when there γ are et 1 shocks. These three linear restrictions, combined with six non-linear restrictions coming from− the orthogonalization of the shocks, are sufficient to identify all nine elements in BQ asso- ciated to the impulse responses of nt, wt, bt to each of the three shocks. Intuitively, equation (Id1) separates the "wealth" shock from the other two shocks. If the unexpected component of em- ployment, wages, and assets does not co-move in the linear way implied by equation (Id1), when η et = 0, a "wealth" shock must have occurred. Analogously, equation (Id2) separates "discount rate" and "productivity" shocks. If the unexpected component of employment, wages, and assets z η does not co-move in the linear way implied by equation (Id2), when et 1 = et 1 = 0, a "discount rate" shock occurred. For completeness, matrix BQ solves the system: − −

1 0 g + ϑ g + ϑ g BQ 0 1 = 0 0 31 n 32 w 33   0 0      1  g + ϑ g + ϑ g Bρ + 0 0 h + ϑ BQ 0 = 0 31 n 32 w 33 1 33 b   0       V = BQ(BQ)0

Finally, to construct the full matrix Q from BQ, we just need to append the last line corre- 1 0 0 0 1 0 sponding to the impulse responses of τ . That is : Q =   BQ. t 0 0 1  θ θ 0   n w    S.3.4 Estimating the structure ξ¯ and the recursive representation Γ0 I estimate a vector autoregression on employment, wages, and assets via weighted OLS where the weights are the 2006 population in the state, using data described in subsection S.3.1. For each variable and year, I take the cumulative log-growth between 2006 and 2011 and express it

S.13 in log-deviations from the average across states. I pool all data between 2006 and 2011, leaving 0 0 240 observations (5 years∗48 states), and estimate common autoregressive coefficients ρ1, ρ2 and reduced form errors U covariance matrix for all states V0 = UU0 .9 240 3∗2 0 0 − Given ρ1, ρ2, we find solutions X with all eigenvalues inside the unit circle to the quadratic 0 0 equation ρ2 = (X ρ1)X. Under Assumptions 1-2 and Property 4, there are only two such − 0 solutions. The first corresponds to BP B0 in the unique stable recursive representation of the 10 0 0 0 1 equilibrium under fiscal integration. The second corresponds to (BQ )N (BQ )− . I identify 1 BP0B as the solution that results in an implied N0 = BQ0 − (BP0B ρ0)BQ0 that satisfies the 0 0 − 1 exclusion restrictions described in Assumption 5.  From the restrictions implied by the third line of the structure in Assumption 5, together with results in Theorem 1, we have,

0 0 0 0 g31 + ϑn g32 + ϑw g33 BP B0 + 0 0 h33 + ϑb = 01.3

    0 Then, g31, g32, h33 are identified from the above system of equations, given BP B0 and setting B¯ g33 τ¯ , to match the ratio of median net worth across states to total income tax revenues in the ≡ 0 0 0 United States in 2006, and ϑn = 1.6, ϑw = 0.9, ϑb = 0.03, which correspond to the OLS policy − − −0 0 rule estimates in Table 3. To construct the full matrix P from BP B0, we just need to append the 0 0 0 BP B0 0 last line and column corresponding to τt. That is : P =  0 0 0 0 ,  . ϑn ϑw 0 BP B0 + 0 0 ϑb 0   0         Then, with the above inputs, I follow Section S.3.3 to identify Q0. Furthermore, the rest of the  structure ξ¯ is identified by following the results in Theorem 1 using the restrictions implied by Assumption 5 and the estimated P0, Q0, N0, Θ0.

S.3.5 Sensitivity to alternative policy specifications Results from the previous section are based on a particular transfer policy specification where ϑ = 1.6, ϑ = 0.9, ϑ = 0.03, which corresponds to the OLS estimates of the transfer policy n − w − b rule in Table 3 in Appendix S.3. First, we would like to evaluate the sensitivity of results to estimates of the transfer policy rule other than the benchmark OLS estimates. Thus, I consider alternative initial policy parameteriza- tions corresponding to the instrumental variable estimates in Table 3. Results are similar to those reported in the previous section for the benchmark policy rule. Although quantitatively reduced somewhat for some of the parameterizations, the reduction of dispersion in employment across states due to fiscal integration remains large. The largest difference is for the case in which I re- strict coefficients on employment and wages in the policy rule to be identical (ϑ = ϑ = 1.1). n w − In this case, the counterfactual employment standard deviation in 2010 would have been 3.1 percent (instead of 2.6 percent in the data), and the counterfactual standard deviation in the stationary distribution is 4.5 percent (instead of 3.5). The counterfactual using the benchmark

9 0 0 0 Note that ρ1, ρ2, V do not exactly correspond to the theoretical matrices in (SVAR). The reason is that the the- oretical matrices also include the lines and columns associated with transfers τt, whereas I have estimated the VAR without them. Then, for example, V0 is missing the fourth line compared to V(ξ, N, Θ), which corresponds to the variance of reduced form expectational errors of τt. 10 0 Since τt is not a state variable in the set of models we study, then the fourth column of P is a column of zeros. 0 Then, P B0 is the autoregressive matrix without this fourth column.

S.14 policy estimates instead resulted in counterfactual standard deviations of 3.5 and 4.9 percent respectively. Second, we would like to evaluate an alternative policy that accounts for the distortionary effects of taxation. The benchmark policy specification is consistent with households in each state receiving lump-sum transfers from the federal government because the only affected equa- tion that characterizes the equilibrium is the sequential budget constraint (the third equation in this case). In practice, lump-sum transfers are rare and, instead, the federal government uses distortionary taxes. For simplicity, consider the transfer policy without lagged assets in log-deviations from the aggregate.11

τt = ϑnnt + ϑwwt

This implies that tax rate τ˜t per unit of nominal labor income wt + nt in log-deviations from the aggregate can be written as:

τ˜ = (1 + ϑ )n (1 + ϑ )w t − n t − w t

The potential labor supply (or wage-setting) tax distortion relates to τ˜t, not total transfers τt for which we estimated elasticities ϑw, ϑn. If the federal tax-and-transfer system affects equilibrium equations beyond the sequential budget constraint, (1 + θw), (1 + θn) would appear in these equations, not θw, θn. I consider the case in which the second equation in the structure is a (Labor Market) equation. If the federal tax-and-transfer system is distortionary, we can write this equation as:

τϑ¯ τϑ¯ 0 = f E [n ] + f E [w ] + g + n (1 + ϑ ) n + g + n (1 + ϑ ) w 21 t t+1 22 t t+1 21 1 τϑ¯ n t 22 1 τϑ¯ w t  − n   − n  + h21nt 1 + h22wt 1 + Et[zt+1] + l23Et[ηt+1] + m22zt + m23ηt − − The equation above accords with the tax rate that affects the target wage in the wage-setting equation by distorting the marginal rate of substitution. The distortion is given by terms 1 + θn and 1 + θ . For example, the case θ = θ = 1 is such that the tax schedule is flat (i.e., a w w n − proportional labor income tax). Due to the lack of curvature, it would not affect the island’s log-deviations of the marginal rate of substitution from the aggregate. I next construct a Robust Transfer Policy Counterfactual using this alternative policy spec- ification, setting τ¯ = 0.17 to match the average tax rate in the US economy. I find that the results from the previous section are essentially unchanged because estimates of transfer policy rule ϑ , ϑ are close to 1. Thus, policy-related terms that distort this equation are very small n w − in absolute magnitude, in comparison with terms in the policy-invariant structure. To see this, consider the case in which the second equation is interpreted as a static labor supply equation, τϑ¯ n and the policy rule depends on employment alone: wt = g21 1 τϑ¯ (1 + ϑn) nt. Plausible − − − n calibrations of labor supply Frisch elasticity g are in the range 0.5 to 4.12 The policy-related − 21 term for the case when τ¯ = 0.17, ϑ = 1.6 is -0.08, which is an order of magnitude smaller than n − the Frisch elasticity.

11Since the estimated coefficient on lagged assets is so small, results are identical whether we include it or not. 12See Chetty, Guren, Manoli, and Weber (2011).

S.15 S.4 Additional Examples

S.4.1 Interest rate rule counterfactuals in New Keynesian models The canonical New Keynesian model (e.g., Clarida, Gali, and Gertler (2000)) describes the equi- librium behavior of inflation πt, output yt and nominal interest rate rt in log-deviations from a zero inflation steady state in terms of three equations,

0 = E (y ) E (π ) + y + i + γ (Euler) − t t+1 − t t+1 t t t 0 = βE (π ) + κy π + z (NKPC) t t+1 t − t t 0 = i + θ y + θ π + µ (Policy rule) − t y t π t t where zt, γt, µt are independent AR(1)’s and θp, θy are interest rate policy rule parameters. The standard approach for analyzing counterfactual interest rate rules begins by estimating the structural parameters using data under a benchmark policy rule. Then, one would solve for the recursive representation of the equilibrium under alternative rules and compare properties of such alternative equilibria (e.g., output and inflation volatility). Such approach begets a natural question: would results change if one used a different model? I begin by considering three alternative New Keynesian models. Their different primitives only change the New Keynesian Phillips Curve. In the first model, there is a large family of agents in the household. A fraction of the agents γt are unemployed and consume the transfers from the family alone. Exogenous variation in γt over time generates variation in the intertemporal marginal rate of substitution of consumption and the intratemporal marginal rate of substitution between consumption and labor. This implies that γt now appears in the (NKPC) equation as well. κγ¯ 0 = βE (π ) + κy π + z + γ (NKPC1) t t+1 t − t t 1 + γ¯ t

The second model follows from Gabaix (2020). He derives a behavioral version of the NKPC when firms are partially inattentive to future macro conditions.

0 = βM f E (π ) + κm f y π + z (NKPC2) t t+1 t − t t The degree of inattention is parameterized by M f and m f . For example, the behavioral NKPC features extra discounting because M f < 1 when compared to the canonical model. In the third model, firms face a working-capital-in-advance constraint. The constraint forces them to hold a fraction χ of their sales in cash overnight. Such cash holdings depreciate because of inflation, thus adding to the effective marginal cost.

0 = βE (π ) + κy + (κχ 1)π + z (NKPC3) t t+1 t − t t Next, imagine the following thought experiment. We parameterize the behavioral model and solve for the recursive representation of the equilibrium yt, πt, it under the benchmark policy 0 0 { } θy, θπ. Since the behavioral model (and all the others in this section) do not have endogenous

S.16 state variables, the recursive representation of the equilibrium takes the form of:

yt γt π = Q0 z (RRNK)  t   t  it µt     For the purposes of this section, I will assume that there always existe a recursive representation 13 1 1 and that such representation is stable and unique. Then, consider an alternative policy θy, θπ. Keeping all other parameters unchanged, we also construct the recursive representation of the equilibrium Q1 under the alternative policy. Claim 1. (Observational and Counterfactual Equivalence in New Keynesian models) Let Q0 and Q1 be the recursive representation of the equilibria generated by the behavioral model under policies θ0, θ0 and θ1, θ1 . Then, { y π} { y π} (1) (Observational Equivalence) Both the unemployed-agents model and the working-capital model can be parameterized to match Q0 under policy θ0, θ0 . { y π} (2) (Counterfactual Equivalence) Given such parameterization, the working-capital model also generates Q1 under policy θ1, θ1 . The unemployed-agents model does not. { y π} The first claim says that all three models satisfy a strong form of observational equivalence. These models could not be distinguished even if we observed both an infinite time-series of inflation and output, and the structural shocks that generated such time-series. This is because they have an identical recursive representation under the benchmark policy.14 The second claim says that the behavioral and working-capital model are counterfactually equivalent with respect to changes in interest rate policy rules, but the unemployed-agents model is not. We can now show Claim 1 by inspecting the New Keynesian Phillips Curve associated with the behavioral, unemployed-agents, and working-capital models (the Euler equation is identical in all three and can be omitted):

yt+1 yt γt κm f 1 0 β 0 E π + + 0 π + 010 z = 0 (Behavioral) t  t 1  M f M f  t   t  i − i µ   t+1 h i t   t       yt+1 yt γt 0 β 0 E π + κ κχ 1 0 π + 010 z = 0 t  t+1  −  t   t  i i µ   t+1   t   t       (Working-capital)

yt+1 yt γt 0 β 0 E π + κ 1 0 π + κγ¯ 1 0 z = 0 t  t+1  −  t  1+γ¯  t  i i µ   t+1   t h i t      (Unemployed-agents)

Note that, when compared with the behavioral model, both the unemployed-agents and working-capital models impose the same number of restrictions in the structure and have the

13Well known issues of multiplicity, indeterminacy, and instability in New Keynesian models are beyond the scope of this section. 14A weaker (and more typical) form of observational equivalence would only require that these models cannot be distinguished from observations of inflation and output alone. Even if they had different recursive representations, they could match the same data with alternative realizations of the structural shocks.

S.17 same number of free parameters. Because of the linearity in the reverse-mapping (Null OE) in Lemma 1 of the paper, this implies that both models can be parameterized to match Q0. Thus, they are all observationally equivalent. However, while the working-capital and behavioral mod- els impose identical restrictions (the coefficients associated with Et[yt+1] and γt are zero, and the one associated with Et[πt+1] is β), the unemployed agents model imposes a different one (the coefficient associated with π is equal to 1). This implies that the working-capital model can t − be parameterized to have an identical structure to the behavioral model. This a special case of the sufficient condition in Proposition 1.A in the paper. As such, the working-capital and behav- ioral models will generate an identical counterfactual equilibrium Q1 under an alternative policy Θ1. However, for the parameterization that makes the unemployed-agents model observationally equivalent, it is not possible to write (NKPC3) as a linear combination of (NKPC1) and (Euler). Neither it is the case that for this parameterization (NKPC3) solves (Null CE) in Lemma 2 of the paper. Thus, its structure does not belong to the same counterfactually equivalent set than the behavioral and working-capital models.

S.4.2 Interest rate rule counterfactuals in SVARs Using data from an economy under a benchmark interest rate rule θy, θπ , a researcher has { 0 0 } estimated the following SVAR representation of output yt, inflation πt and interest rates it.

γ yt yt 1 et 0 − 0 z πt = ρ πt 1 + Q et    −   µ  it it 1 et    −    The researcher would like to evaluate how the behavior of inflation and output would change in a counterfactual economy with a more "hawkish" interest rate policy that only responds to y π π changes in inflation (i.e., θ1 = 0 and θ1 > θ0 ). Next, I present a simplified exposition of the methodology in Theorem 1 of the paper, as it applies to 3-equation New Keynesian models. I start by assuming that the estimated the recursive representation is a SVAR ρ0, Q0 and { } was generated by some model that belongs to the class of 3-equation New Keynesian models described in the previous section. As such, its equilibrium equations can be described by a struc- ture and a policy ξ, Θ . Furthermore, if the impulse response matrix Q(ξ, N, Θ) is invertible, { } then the equilibrium has a finite SVAR representation. This is:

γ yt yt 1 et − z πt = ρ(ξ, Θ) πt 1 + Q(ξ, N, Θ) et    −   µ  it it 1 et    −    1 where ρ(ξ, N, Θ) Q(ξ, N, Θ)NQ(ξ, N, Θ)− . ≡ 0 0 0 1 0 0 0 0 0 First, note that N is recovered as N = (Q )− ρ Q . Second, given N , Q , Θ , we can construct the linear system (Null OE) in the paper, because any structure ξ matching this repre- sentation satisfies:

FQ0N0 + GQ0 + M = 0

As we have argued in Theorem 1, the structure is identified by imposing enough restrictions per line. For example, consider the restrictions satisfied by the behavioral and working-capital

S.18 models from the previous section:

f f 0 g 0 1 1 0 0 F = 11 12 ; G = 11 ; M = 0 f 0 g g 0 0 1 0  22   21 22    The system (Null OE) gives 3 equations in this example. Adding the linear restrictions in each line above results in 3 additional equations. Thus, we have a linear system of 6 equations in the 6 unknowns f11, f12, g11, f22, g21, g22. Then, given the identified restricted structure ξ¯ and the alternative "hawkish" interest rate policy rule Θ1, we can solve for the counterfactual SVAR impulse response matrix Q(ξ¯, N, Θ1) by solving the usual non-linear system (SMERR) in the paper. Finally, the counterfactual autore- 0 1 1 0 0 1 1 gressive matrix is ρ(ξ¯, N , Θ ) = Q(ξ¯, Θ )N Q(ξ¯, N , Θ )− . To conclude, the counterfactual SVAR representation under the alternative interest rate pol- icy rule is robust to variations across many 3-equation New Keynesian models, provided they can match the SVAR representation under the benchmark policy rule and belong to the same counterfactually equivalent set (i.e., solve both (Null OE) and (Null CE)).

S.4.3 Unemployment Benefits in Search-Theoretic Models of the Labor Market

I start by reviewing the key equations characterizing the equilibrium wage (wt) and vacancy-to- unemployment ratio (ϑt) in the discrete-time version of a canonical search-and-matching model with Nash Bargaining and free entry of firms (Mortensen and Pissarides (1994); Pissarides (2000)).

1 φ φc 0 = w + y + ϑ + z + b (Wage Setting (1)) −1 φ t 1 φ t 1 φ t t − − − c (1 s)c 0 = + βE y + w + + − (Job Creation (1)) − q(ϑ ) t t 1 − t 1 q(ϑ ) t  t+1  The (Wage Setting (1)) equation determines how the surplus of a match is split between the worker and the firm. Thus, it relates the wage wt to the (exogenous) productivity of the match yt, the cost of posting a vacancy c, the vacancy-unemployment ratio ϑt, the bargaining power of the worker φ, the utility while unemployed z and, importantly for our purposes, unemployment benefits bt. The (Job Creation (1)) equation determines firms’ incentives to post vacancies and, because of free-entry, equates the cost of posting a vacancy to the expected benefit of a match which depends on the discount factor β and the vacancy-filling probability q(θt). Θ ˜ ϑt ˜ Next, imagine we are interested in evaluating the policy rule bt = bt + ϑ¯ , where bt is an exogenous policy shock and Θ > 0 governs how generous benefits are when  labor market slackness is above or below its long run level.15 Moreover, suppose that after estimating the canonical model with data on wages and the vacancy-unemployment ratio alone, we conducted a series of counterfactual exercises and found that the policy Θ have negligible effects on the equilibrium behavior of these variables. How should we proceed if either (i) we wished to know how robust this quantitative result is to variation in model primitives or, relatedly, (ii) we wished to construct models that generated non-negligible effects? I argue that the principle of Counterfactual Equivalence offers us some guidance. To see this, we can log-linearize the system of equilibrium equations around the steady-state, which gives

15As a motivation for this policy rule, unemployment benefits were substantially extended during the Great Reces- sion—evidencing that its generosity may depend on the state of the business cycle.

S.19 rise to a structure with the following restrictions,16

0 0 0 g g g 0 0 0 0 0 m 0 F = ; G = 11 12 13 ; H = ; L = ; M = 12 f f 0 0 g 0 0 0 0 0 l 0 0  21 22   22     22    By noting the exclusion restrictions that this structure satisfies, we can readily observe that the (Wage Setting (1)) equation has a rather restricted structure compared to the (Job Creation (1)) equation. This observation, when combined with the results from Section 2 in the paper, implies that focusing on primitives or mechanisms that change how wages are set is a promising avenue for building models that could generate non-negligible effects of unemployment benefits policy (in the hypothetical case that the canonical model generated negligible effects). Analo- gously, models that do not significantly alter the (Job Creation (1)) equation (i.e., by changing the exclusion restrictions) will likely be counterfactually equivalent to the canonical model, thus implying that the negligible effect of changes in the unemployment benefits policy rule would be a robust feature of models with varying micro-foundations regarding how firms post vacancies. For example, if we replaced the assumption of Nash Bargaining with an ad-hoc wage rule ala Hall (2005), we would obtain a wage setting equation of the form,

0 = wt + zt + λyt + (1 λ)wt 1 (Wage Setting (2)) − − − where λ governs how "sticky" wages are. Or, if we replaced it with the alternating-offer-wage-bargaining protocol in Christiano, Eichen- baum, and Trabandt (2016), we would obtain the wage setting equation,

0 = ω (1 β (1 s ϑ q(ϑ ))) γ + (1 + ω )(w z ) (ω + ω )(y z ) cω ϑ − 2 − − − t t 1 t − t − 1 3 t − t − 1 t + β (1 s ϑ q(ϑ )) ω E [y z ] (Wage Setting (3)) − − t t 3 t t+1 − t+1 where γ is the cost of delay in bargaining and the ωi’s are combinations of structural parameters. Both structures generated by these models satisfy different exclusion restrictions than the canonical model because they include backward- and forward- looking terms. As a result, it is easy to show that they will belong to different counterfactually equivalent sets with respect to changes in unemployment benefits policy rule. However, for instance, the seemingly richer model of a financial accelerator in Wasmer and Weil (2004) generates an identical structure to the canonical model. They assume that matching in a credit market between firms and creditors is subject to search frictions analogous to the ones in the labor market. Then, the presence of frictional credit market adds to the cost of posting vacancies because firms have to be matched to a creditor before they can enter the labor market. In equilibrium, as opposed to the canonical model, the value of a vacancy in the labor market is given by a positive constant K that depends on the search costs in the credit market and the matching probabilities of firms and creditors. Following the the dynamic extension derivation in Petrosky-Nadeau and Wasmer (2013), we obtain equilibrium equations,17

16The wage, vacancy-unemployment ratio, and benefits in log-deviations from the steady-state correspond to the first, second, and third columns of matrices F, G, H. Analogously, the exogenous policy and productivity shocks correspond to the first and second columns of matrices L, M. The firs line corresponds to (Wage Setting (1)) and the second line to (Job Creation (1)) 17They assume that firms and creditors Nash-bargain (together) with workers in the labor market and firms and creditors Nash-bargain with each other in the credit market.

S.20 1 φ φ 0 = w + y + (c + (1 β(1 q(ϑ )))K)ϑ + z + b (Wage Setting (4)) −1 φ t 1 φ t 1 φ − − t t t − − − 1 1 s 0 = (c + (1 β(1 q(ϑ )))K) + βE y + w + + − (c + (1 β(1 q(ϑ + )))K) − q(ϑ ) − − t t t 1 − t 1 q(ϑ ) − − t 1 t  t+1  (Job Creation (4))

Note that, compared to the canonical model, the effective cost to posting a vacancy is aug- mented by the additional term (1 β(1 q(θ )))K which encodes the search frictions in the − − t+1 credit market. However, it is easy to verify that this system of equations satisfies identical ex- clusion restrictions than the canonical model. Thus, if the financial accelerator model and the canonical model can match the equilibrium behavior of wages and labor market slackness, they are also counterfactually equivalent with respect to changes in the unemployment benefits policy rule. The same holds in a model where firms choose recruiting intensity e. In the spirit of Gavazza, Mongey, and Violante (2016), assume that the cost of posting a vacancy is a well-behaved function c(e, ϑ) and the probability of filling the vacancy is q(ϑe¯)e, where e¯ is the average recruiting intensity in the economy. Then, we obtain the following equilibrium equations when firms optimally choose identical recruiting intensities,

1 φ φ 0 = w + y + c (e(ϑ ), ϑ )ϑ + z (Wage Setting (5)) −1 φ t 1 φ t 1 φ e t t t t − − − ce(e(ϑt), ϑt) (1 s)ce(e(ϑt+1), ϑt+1) 0 = + βE y + w + + − (Job Creation (5)) − q(ϑ e(ϑ )) t t 1 − t 1 q(ϑ e(ϑ )) t t  t+1 t+1  Again, while this model behaves as if it had a matching and vacancy posting cost with extra curvature, it has an identical structure to the canonical model. Thus, I conclude that if the canon- ical model generates negligible effects of alternative unemployment benefits policy rules, then this quantitative result is robust to variation in primitives regarding certain forms of financial frictions and endogenous recruiting intensity.

References

Bachmann, Rüdiger and Eric R. Sims. 2012. “Confidence and the transmission of government spending shocks.” Journal of Monetary Economics 59 (3):235 – 249.

Baumeister, Christiane and James D Hamilton. 2015. “Sign restrictions, structural vector autore- gressions, and useful prior information.” Econometrica :1963–1999.

Bayoumi, Tamim and Paul R Masson. 1995. “Fiscal flows in the United States and Canada: lessons for monetary union in Europe.” European Economic Review 39 (2):253–274.

Beraja, Martin, Erik Hurst, and Juan Ospina. 2015. “The Aggregate Implications of Regional Business Cycles.” Unpublished Manuscript, University of Chicago .

Bernanke, Ben S, Mark Gertler, and Mark Watson. 1997. “Systematic monetary policy and the effects of oil price shocks.” Brookings papers on economic activity 1997 (1):91–157.

S.21 Chetty, Raj, Adam Guren, Day Manoli, and Andrea Weber. 2011. “Are micro and macro labor supply elasticities consistent? A review of evidence on the intensive and extensive margins.” The American Economic Review 101 (3):471–475.

Christiano, Lawrence J, Martin S Eichenbaum, and Mathias Trabandt. 2016. “Unemployment and business cycles.” Econometrica 84 (4):1523–1569.

Clarida, Richard, Jordi Gali, and Mark Gertler. 2000. “Monetary Policy Rules and Macroeconomic Stability: Evidence and Some Theory.” Quarterly Journal of Economics :147–180.

Feyrer, James and Bruce Sacerdote. 2013. “How Much Would US Style Fiscal Integration Buffer European Unemployment and Income Shocks?(A Comparative Empirical Analysis).” The American Economic Review 103 (3):125–128.

Gabaix, Xavier. 2020. “A behavioral New Keynesian model.” American Economic Review 110 (8):2271–2327.

Gavazza, Alessandro, Simon Mongey, and Giovanni L Violante. 2016. “Aggregate Recruiting Intensity.” Tech. rep., National Bureau of Economic Research.

Hall, Robert E. 2005. “Employment fluctuations with equilibrium wage stickiness.” The American Economic Review 95 (1):50–65.

Kilian, Lutz and Logan T Lewis. 2011. “Does the Fed respond to oil price shocks?” The Economic Journal 121 (555):1047–1072.

McKay, Alisdair, Emi Nakamura, and Jón Steinsson. 2017. “The discounted euler equation: A note.” Economica 84 (336):820–831.

Mian, Atif, Kamalesh Rao, Amir Sufi et al. 2013. “Household Balance Sheets, Consumption, and the Economic Slump.” The Quarterly Journal of Economics 128 (4):1687–1726.

Mian, Atif and Amir Sufi. 2014. “What Explains the 2007–2009 Drop in Employment?” Econo- metrica 82:2197–2223.

Mortensen, Dale T and Christopher A Pissarides. 1994. “Job creation and job destruction in the theory of unemployment.” The review of economic studies 61 (3):397–415.

Petrosky-Nadeau, Nicolas and Etienne Wasmer. 2013. “The cyclical volatility of labor markets under frictional financial markets.” American Economic Journal: Macroeconomics 5 (1):193–221.

Pissarides, Christopher A. 2000. Equilibrium unemployment theory. MIT press.

Ravenna, Federico. 2007. “Vector autoregressions and reduced form representations of DSGE models.” Journal of monetary economics 54 (7):2048–2064.

Ruggles, Steven, Matthew Sobek, Catherine A Fitch, Patricia Kelly Hall, and Chad Ronnander. 1997. Integrated public use microdata series: Version 2.0. Historical Census Projects, Department of History, University of Minnesota.

Schmitt-Grohé, Stephanie and Martın Uribe. 2003. “Closing small open economy models.” Journal of international Economics 61 (1):163–185.

S.22 Sims, Christopher A and Tao Zha. 2006. “Does monetary policy generate recessions?” Macroeco- nomic Dynamics 10 (2):231–272.

Wasmer, Etienne and Philippe Weil. 2004. “The macroeconomics of labor and credit market imperfections.” The American Economic Review 94 (4):944–963.

S.23