

Revised: 2 February 1994

SOME ECONOMETRIC IMPLICATIONS OF LEARNING

by

Balazs Horvath

International Monetary Fund, 700 19th Street, Washington, DC 20431, (202) 623-8529

and

Marc Nerlove

Department of Agricultural and Resource Economics, 2200 Symons Hall, University of Maryland, College Park, MD 20742-5535, (301) 405-1293, FAX (301) 314-9091, e-mail: [email protected]

Helpful discussions with T.W. Anderson, Viktoria Dalko, Javier Gardeazabal, Christian Gourieroux, Nicholas Kiefer, Richard Kihlstrom, Wilhelm Krelle, George Mailath, Roberto Mariano, Bruce Mizrach, Marta Regulez, Rafael Rob, Christopher Sims, Stefano Siviero and Douglas Wilson are gratefully acknowledged. This paper draws from Horvath's dissertation [14]. We thank the anonymous referees for their suggestions.

This research was supported in part by a grant from the National Science Foundation to the University of Pennsylvania (SES 8921715).

Correspondence regarding this paper should be sent to the authors at the above addresses.

ABSTRACT

This paper explores one way in which policy variables, often considered to be exogenous in econometric modeling, may not be. We present a simple discrete-time model in which a single decision maker optimizes an objective function subject to a constraint involving an unknown parameter. In all other respects the decision maker is assumed to have complete information about his environment. As time passes he learns about the value of the unknown parameter. Learning is modeled by Bayesian updating, which has optimal properties. We formalize learning and show that the time series of decision variables which results cannot be considered exogenous in any sense by an outside econometrician utilizing the data and a model involving the decision and outcome variables. This result follows from the definitions of Engle, Hendry and Richard [6]. Loss of structural invariance of parameters in an econometric model describing a system involving a learning agent may occur as a result of learning, which, we argue, is not necessarily merely a transitory source of this phenomenon. Finally, we show that the presence of learning is detectable by the outside econometrician using the data alone, essentially by means of a test for exogeneity. This leaves open the question of whether exogeneity fails because of learning or for other reasons. The nonstationarity in observed variables induced by learning is not generally of the unit-root type, nor is it likely to cause problems with standard tests for exogeneity.

"Knowledge is helpful if it helps to make the best decisions." J. Marschak [1953]

1. INTRODUCTION

Since the earliest formulations of macroeconometric models, policy variables such as tax rates, treasury bill rates, agricultural support prices, and the like, have been treated as exogenous in the modeling of the behavior of agents which policymakers seek to modify or control. For example, in his masterful exposition of simultaneous equations problems and the desirability of structural estimation, Jacob Marschak [21] discusses a simple model of a government which sets tax rates in order to maximize an objective function depending upon tax revenues and output (pp. 10-11). Tax rates are considered as exogenous in the determination of output. Similarly, in the many macroeconometric models surveyed in Nerlove [25], variables such as the money supply, required reserve ratios, taxes or tax rates, and other policy variables, or variables which the models' authors argued were policy variables, were treated as exogenous. This practice has continued up to the present day and is enshrined in econometric practice by such standard texts as Pindyck and Rubinfeld [28]. (See especially pp. 396-412.)

Notwithstanding econometric practice, policy variables may fail to be exogenous for many reasons. In a democratic society, policymakers may respond to economic signals if these reflect economic determinants of political behavior. Another, possibly important, source of such failure is learning on the part of decision makers. Whatever the social welfare or objective function maximized by decision makers, learning about the underlying political and economic structure which constrains policy choices may result in the phenomenon we investigate here. Learning, in a general sense, causes exogeneity to fail and may imply nonstationarity in the behavior of the economy observable by the "outside" econometrician.

In this paper, we use a simple extension of Marschak's early expository model to show that the assumption that policy variables are exogenous may fail to be tenable if the policymaker does not know the underlying structural parameters of the economy and tries to estimate these as he goes along. The illustrative model we employ is a finite-horizon version of the well-known "Laffer curve." The model itself is obviously not sufficient to represent reality but serves well to make our general point that, in the presence of policymaker learning, the policy variable(s) cannot be treated as exogenous by the outside econometrician observer. While this may seem a trivial point, our exploration within the context of a Bayesian learning framework leads to a number of unexpected insights with respect to what can or cannot be consistently and/or efficiently estimated and which relationships usually treated as structural may or may not be stable.

Consider a policymaker whose social welfare function includes only expenditures on public goods and who must balance his budget. Thus he maximizes the discounted sum of tax revenues by determining, in each period, a tax rate to be applied to individuals in the economy. The constraint the policymaker faces is a tax-revenue function which is assumed to have a unique maximum between a tax rate of 0% and 100% (sometimes called a Laffer curve). It is subject to some stochastic variation and the parameter determining the location of the maximum is not known with certainty. More generally, the problem is one of optimal control of a stochastic process with some unknown parameters. The agent controlling the process has beliefs about these parameters embodied in a prior probability distribution.
He is assumed to refine his beliefs, or learn, as time goes on, in the sense of updating his prior distribution via Bayes' rule, using the information that emerges as the process proceeds. This information may also depend on the tax rate chosen. With more information, more efficient optimization is possible in subsequent periods. Therefore, the agent in any given period has to make an optimal trade-off between two competing goals: maximizing current payoff given his current level of information versus maximizing the expected information yield about the unknown parameters. This problem is studied in Easley and Kiefer [4], Pesaran [26], El-Gamal and Sundaram [5], Grossman, Kihlstrom and Mirman [11], Kiefer [16], McLennan [23], and Prescott [29], among others. We adopt a finite-horizon objective function with no essential loss of generality. It is well known that with a Gaussian diffuse prior and a squared-error loss function Bayesian updating yields a result coinciding with that of ordinary least-squares learning. Therefore the model presented encompasses the case of ordinary least-squares learning. On the other hand, correct specification of the model is a maintained hypothesis not subjected to statistical test. This leaves the framework open to the criticism that the learning agent may adopt a false model. A related, but distinct, problem is emphasized in McLennan [23]: even if the agent adopts the correct model, his beliefs will not necessarily converge to the truth. Another strand in the literature treats convergence, or failure to converge, of beliefs under Bayesian learning and the implications for the existence of equilibria. Bossaerts [3] provides a very simple example of failure of equilibrium prices to converge to the rational

expectations price in a market in which agents trade a forward contract on an underlying asset, the distribution of whose price and the prior beliefs on its mean are both normal, in accordance with standard assumptions. Beliefs differ and market participants trade only for this reason. Even if convergence obtains, Bossaerts shows that standard test statistics do not converge to their rational expectations equivalents. For example, the average prediction error converges to a random variable having nonzero mean and involving Brownian motion functionals. As a consequence, transient disagreement may not be ignored in testing. We too find that policymaker learning results in nonstationarity. Although in this simple model the consequences are transitory, in general they are not. However, learning does not introduce a unit root in any of the observed series and does not, therefore, cause further difficulty with the usual exogeneity tests on which we propose to base detection of policymaker learning.

We distinguish between open loop policy, passive learning and active learning. Open loop policy is based on an unchanging information set: it encompasses both the full information case and the case when information is less than perfect but is not augmented, i.e., no learning occurs. Passive learning means incorporating any information that happens to be generated as a result of payoff-maximizing behavior, ignoring the experimental design aspects of the problem. Active learning occurs when the policymaker optimally trades off current payoff for future information expected to be generated.

The tax-revenue function can be obtained as a reduced form arising from the interaction of two distinct tax effects. The first is the effect of the tax on total output, the second is the effect of the tax on the amount of evasion. An advantage of this example as an illustration of learning phenomena is that it avoids game-theoretic complications. In the usual setting of a firm exploring its demand curve, a small number of players must be assumed for active learning to be the optimal strategy. In this case, "gaming" behavior is obvious. Grossman, Kihlstrom, and Mirman [11] and Kiefer [16] discuss a monopolist who experiments to best estimate the parameters of his demand curve. The alternative to current payoff is information, which is in general a public good, and may thus lead to a strategic game in which free-rider entrants must be taken into account. In our example, the government holds an uncontestable monopoly of taxation, so that the information generated by experimentation cannot be used directly by any other agent. Bossaerts [3] also avoids strategic effects in his example by restricting the existence of certain markets. However, active learning is generally optimal.


We establish the condition for the strict optimality of active learning, which is shown to hold under general circumstances. However, extreme discounting, or an excessively restrictive specification of the objective function or of the belief distribution, can make active learning no longer strictly optimal.

Next, we consider the situation of an outside econometrician who has access to the data generated by the optimizing agent, but not to the agent's prior beliefs. He also knows the economy's response and is assumed to utilize a correctly specified model involving the policy variable as well as the outcome variable. Under these circumstances, it is clear that the outside econometrician cannot treat his observations on the policy variable as observations on an exogenous variable. We explore the consequences of learning for econometric practice in this setting.

Moreover, learning is clearly related to the question of stability of rational expectations equilibria, since it is the mechanism by which agents acquire the knowledge they are assumed to possess in a rational expectations equilibrium. The equilibrium implications of the rational expectations hypothesis for econometric practice have elicited considerable attention; see, for example, Wallis [33], Hansen and Sargent [13], Pesaran [26, Chapter 6], and Bossaerts [3]. We focus on out-of-equilibrium implications of the rational expectations hypothesis, in particular the impact of learning on the exogeneity of policy variables. Sargent and Wallace [30], Barro [2] and McCallum [22] discuss the circumstances under which the rational expectations hypothesis is compatible with an exogenous policy variable, that is, with an optimal policy rule that is not a feedback rule. Even then, however, it is shown that learning, the very phenomenon that underlies the stability of rational expectations equilibria, creates an informational feedback. Consequently, from the standpoint of the outside econometrician, policy variables will no longer be exogenous by any of the definitions suggested by Engle, Hendry and Richard [6]. We show that learning implies a loss of parameter invariance to changes in policy regimes analogous to the one addressed in the Lucas critique, albeit via a distinctly different channel. Finally, in section 5, we argue that learning is empirically detectable under plausible assumptions, i.e., that it is possible to verify on the basis of the observed data alone that learning has occurred in its generation.

2. A PARADIGMATIC MODEL: TAX RATE DETERMINATION WITH SIMULTANEOUS OPTIMIZATION AND LEARNING

Let the tax-revenue function be given by

$$R(\tau_t) = \tau_t\, Q(\tau_t)\, S(\tau_t), \qquad (1)$$

$$R(0) = R(1) = 0, \qquad (2)$$

where τ_t is the marginal tax rate, assumed to be the same for the whole economy, Q(τ_t) is total output and S(τ_t) is the evasion factor. Let

$$Q(\tau_t) = \bar{Q}\,(1 - \tau_t) \qquad (3)$$

and

$$S(\tau_t) = \alpha - \beta\,\tau_t + u_t, \qquad (4)$$

where β is an unknown parameter, u_t is an i.i.d. doubly symmetrically truncated random variable distributed as N(0, s²), and Q̄, α and s² are known (positive) constants.¹ The reason for the double truncation is that it ensures, with probability arbitrarily close to 1, that the policymaker will not possess unreasonable beliefs at any point in time due to extreme sequences of realizations of the noise term. Without loss of generality, we assume Q̄ = 1. The parameter β is the only unknown. We only consider the case β > 0. Note that u_t has a probability distribution symmetric around zero. This formulation of the problem is thus an extension of that considered in Prescott [29]. Our specification imposes (2) independently of the evolution of beliefs. Thus, learning with respect to the unknown parameter can proceed in an unrestricted manner.

Passive learning corresponds to an approach which treats periods separately. An actively learning agent, on the other hand, maximizes the total discounted sum of revenues, optimally trading off some of the obtainable current revenue for extra information generated. We first develop the problem of an actively learning agent, since passive learning is seen to be a special case. Formulate the policymaker's problem as a multiperiod problem with finite, known horizon T > 2. In period t the policymaker seeks to

$$\max_{\tau_t,\ldots,\tau_T}\; E_t \sum_{i=t}^{T} \delta^{\,i-t}\, R(\tau_i), \qquad (5)$$

where the discount factor δ ∈ [0, 1] is a known constant. Restriction to a finite time horizon T involves no essential loss of generality and is made here for expository simplicity. The policymaker chooses the tax rate τ_t for each period so as to maximize (5) given the information available to him in the current period. The information set contains the sufficient statistics of all the payoff-relevant parameters: the values of the known parameters and the current belief distribution about β, updated via Bayes' rule utilizing all observations that have become available by the current period.

The period objective function itself is static, but time periods are connected via the evolution of beliefs. The policymaker's optimization problem given by (5) may be written, using (1), (3) and (4), as:

$$\max_{\tau_t,\ldots,\tau_T}\; E_t \sum_{i=t}^{T} \delta^{\,i-t}\, \tau_i (1 - \tau_i)\, [\alpha - \beta\,\tau_i + u_i]. \qquad (6)$$

The parameter β is unknown to the policymaker; therefore the expectation above involves both the current prior distribution embodying the beliefs of the policymaker and the current expectation of the noise term u_t. If, as we assume, u_t is i.i.d., the expectation operators in (6) become E_t^{(u)} and E_t^{(β)}, superimposed, where the superscript indicates the distribution with respect to which the expectation is to be computed and the subscript indicates the information set on which the expectation is conditioned.

Assume that the prior probability density function embodying beliefs about β is normal. Define precision as the reciprocal of variance, h = σ⁻², and write N(m, h) for a normal distribution with mean m and precision h. Let m_t = E_t(β) denote the mean belief in the current period.² Denote the belief distribution in period t by p_t(β). Let the initial prior be p_1(β) = N(m_1, h_1) with m_1 > 0. Note that the assumption of normality for the belief distribution is in accord with the requirement that β ∈ supp(p_t(β)) for each t. To simplify, let

$$e_{t-1} = \beta\,\tau_{t-1} - u_{t-1}. \qquad (7)$$

To ensure that e_{t-1} can be treated as observable, it is assumed that the policymaker can precisely observe R_{t-1}, the revenue generated in the previous period. Then

$$e_{t-1} = \alpha - \frac{R_{t-1}}{\tau_{t-1}\,(1 - \tau_{t-1})} \qquad (8)$$

is readily computable. Assuming Bayesian updating, we have the following update rules (Prescott [29]):

$$h_t = h_{t-1} + \frac{\tau_{t-1}^2}{s^2}, \qquad (9)$$

$$m_t = \frac{m_{t-1}\,h_{t-1} + \tau_{t-1}\,e_{t-1}/s^2}{h_t}. \qquad (10)$$

These recursions are operational, since they involve only observable quantities. It is simple to show that the policymaker's problem in any period is well defined if

$$\frac{\alpha}{m_t} > 3\,\tau_t - 1 \quad \text{and} \quad m_t > 0.^{3} \qquad (11)$$
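One way to read condition (11), under our reconstruction of this partly garbled display (not an additional assumption of the paper): the expected current-period revenue under mean belief m_t is

$$r(\tau) = \tau\,(1-\tau)\,(\alpha - m_t\,\tau), \qquad r''(\tau) = -2\,(\alpha + m_t) + 6\,m_t\,\tau,$$

so r''(τ) < 0 exactly when α/m_t > 3τ − 1, i.e., the first part of (11) is the second-order (concavity) condition for the expected Laffer curve at the chosen tax rate; the requirement m_t > 0 keeps beliefs consistent with the maintained assumption β > 0.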

This condition need not hold in each period, however, if extreme sequences of realizations of the noise term are not ruled out. This is the reason for the double truncation of the support of u_t. We proceed by defining the value function as the function that gives the maximized value of the objective function in each period:

$$V_t(p_t) = \max_{\tau_t,\ldots,\tau_T}\; E_t \sum_{s=t}^{T} \delta^{\,s-t}\, \tau_s (1 - \tau_s)\,[\alpha - \beta\,\tau_s + u_s].$$

Clearly, the period-t value function is a function of current beliefs p_t ∈ P (P is defined as the space of probability distributions with finite moments). For our model, p_t = N(m_t, h_t). Following Easley and Kiefer [4], rewrite the value function as

$$V_t(p_t) = \max_{\tau_t \in [0,1]} \left( E_t\, R(\tau_t) + \delta\, E_t\!\left\{ V_{t+1}(p_{t+1}) \right\} \right), \qquad (12)$$

where beliefs corresponding to the next period are assumed to be generated via Bayes' rule: p_{t+1} = B(p_t, τ_t, e_t). The trade-off between current gain and future information is now apparent: both components of (12), the current payoff and the future value of the problem, are functions of the choice of τ_t, directly or indirectly. The case of passive learning is much simpler: the objective function is just E_t R(τ_t) in each period; beliefs are updated in the same way, but the decisionmaker does not optimize the generation of information.
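As an illustration, the following minimal Python sketch simulates a passively learning policymaker in the model of this section: each period the tax rate maximizes expected current revenue τ(1−τ)(α − m_t τ) under current beliefs, revenue is observed, e_t is computed from (8), and beliefs are updated via (9) and (10). The parameter values are illustrative, not taken from the paper; with the prior precision h_1 set near zero the recursions reduce to recursive least squares, the equivalence noted in the introduction.

    import numpy as np

    # Illustrative parameter values (not from the paper)
    alpha, beta_true, s = 1.0, 0.8, 0.05   # constraint parameters; beta is unknown to the policymaker
    m, h = 0.4, 4.0                        # prior beliefs about beta: mean m, precision h
    T = 50                                 # horizon
    rng = np.random.default_rng(0)

    grid = np.linspace(0.01, 0.99, 981)    # candidate tax rates
    tau_path, m_path = [], []

    for t in range(T):
        # Passive (myopic) choice: maximize expected current revenue tau*(1-tau)*(alpha - m*tau)
        tau = grid[np.argmax(grid * (1 - grid) * (alpha - m * grid))]

        # Realized revenue, eqs. (1)-(4); clipping is a crude stand-in for the truncated noise
        u = np.clip(rng.normal(0.0, s), -3 * s, 3 * s)
        R = tau * (1 - tau) * (alpha - beta_true * tau + u)

        # Observable signal, eq. (8): e_t = alpha - R_t / (tau_t (1 - tau_t)) = beta*tau_t - u_t
        e = alpha - R / (tau * (1 - tau))

        # Bayesian updating, eqs. (9)-(10)
        h_new = h + tau ** 2 / s ** 2
        m = (m * h + tau * e / s ** 2) / h_new
        h = h_new

        tau_path.append(tau)
        m_path.append(m)

    print("final mean belief about beta:", round(m_path[-1], 3), "(true value:", beta_true, ")")
    print("tax-rate path (first 10 periods):", np.round(tau_path[:10], 3))

The tax-rate path produced in this way responds to past realized revenues through the belief updates; this is precisely the informational feedback whose econometric consequences are analyzed in section 4.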

3. OPTIMALITY OF ACTIVE LEARNING

Proposition: Experimentation (i.e., active learning) is optimal if the value function is convex in beliefs and strictly convex for some periods.

The possibility of recouping currently foregone payoffs in the future, in expected value terms, is a consequence of the convexity of the value function. Convexity, via Jensen's inequality,⁴ implies:

$$E_t\, V_{t+1}(p_{t+1}) \;\ge\; V_{t+1}\!\left(E_t\, p_{t+1}\right) \;=\; V_{t+1}(p_t). \qquad (13)$$

The equality in (13) is an application of the martingale property for beliefs generated via Bayes' rule (Easley and Kiefer [4]). For active learning to be optimal, the inequality in (13) must hold strictly in at least one period, yielding

$$E_t\, V_{t+1}(p_{t+1}) - V_{t+1}(p_t) > 0. \qquad (14)$$

This difference is the measure of the expected gain to be had from learning actively: when it is strictly positive, the expected reward given anticipated posterior beliefs exceeds the certain reward given current beliefs, making active learning optimal.

We argue that the assumption of Bayesian updating and a constant β together imply that the value function must be convex (not necessarily strictly) in beliefs. In our setting more information cannot reduce the attainable maximum, that is, more information cannot hurt.⁵ It follows from the maximum nature of the value function that we have the Lemma: V_t(p), p ∈ P, is convex, the analogue of Lemma B in Prescott [29].⁶

If there is relevant learnable information, a well-formulated, sufficiently general problem of this class of models inherently has potential for optimal active learning. Indeed, active learning is clearly optimal if certain assumptions are met about the probability distributions employed, the specification of the constraint, the value of the discount factor δ, the extent and variability of the noise, and the length of the horizon in the problem. Note that under the very restrictive assumption that the value function is affine in beliefs in each future period (e.g., if future beliefs are represented only by future mean beliefs, this occurs as a consequence of the martingale property), active learning is not strictly optimal. However, active learning is strictly optimal if future beliefs are represented by the first and higher moments of the probability distribution embodying beliefs and the discount factor δ ∈ (0, 1). One obvious way to guarantee the optimality of active learning is to relax the assumption of risk neutrality, i.e., to provide an explicit role for risk aversion of the decisionmaker. Thus, if there is relevant learnable information, a well-formulated, sufficiently general problem of this class of models leads to a value function strictly convex in beliefs and hence inherently has potential for optimal active learning.
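A minimal numerical illustration of (13)-(14), under illustrative parameter values of our own choosing (not taken from the paper): in a two-period version of the model, total expected revenue is estimated by Monte Carlo for a grid of first-period tax rates; the maximizer generally differs from the myopic choice, and the (weakly positive) difference in total expected payoff is the finite-sample counterpart of the gain (14) from active learning.

    import numpy as np

    # Two-period numerical check: does deviating from the myopic tax rate pay?
    alpha, s, delta = 1.0, 0.2, 0.95        # known constants (illustrative values)
    m1, h1 = 0.8, 10.0                       # period-1 prior on beta: mean m1, precision h1
    rng = np.random.default_rng(1)
    grid = np.linspace(0.01, 0.99, 99)       # candidate tax rates
    n_draws = 5000
    beta = rng.normal(m1, 1 / np.sqrt(h1), n_draws)   # beta drawn from the prior
    u1 = rng.normal(0.0, s, n_draws)                   # period-1 noise (common random numbers)

    def expected_revenue(m, tau):
        # Expected single-period revenue tau(1-tau)(alpha - m*tau) under mean belief m
        return tau * (1 - tau) * (alpha - m * tau)

    def total_value(tau1):
        """Monte Carlo estimate of E[R_1 + delta * max_tau2 E(R_2 | posterior)] for a given tau1."""
        r1 = tau1 * (1 - tau1) * (alpha - beta * tau1 + u1)   # realized period-1 revenue
        e1 = beta * tau1 - u1                                  # signal, eqs. (7)-(8)
        h2 = h1 + tau1 ** 2 / s ** 2                           # eq. (9)
        m2 = (m1 * h1 + tau1 * e1 / s ** 2) / h2               # eq. (10)
        v2 = np.max(expected_revenue(m2[:, None], grid[None, :]), axis=1)  # period-2 value
        return np.mean(r1 + delta * v2)

    values = np.array([total_value(t) for t in grid])
    i_myopic = np.argmax(expected_revenue(m1, grid))
    i_active = np.argmax(values)
    print(f"myopic period-1 tax rate:    {grid[i_myopic]:.3f}")
    print(f"two-period optimal tax rate: {grid[i_active]:.3f}")
    print(f"expected gain from experimenting: {values[i_active] - values[i_myopic]:.5f}")

Because the precision gain τ₁²/s² in (9) rises with τ₁, the two-period optimum in this sketch typically lies somewhat above the myopic rate, exactly the trade-off between current payoff and information described in the text.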

4. ECONOMETRIC IMPLICATIONS

There are a number of concepts of exogeneity according to which we can assess the policy variable of our model. Koopmans' [18] original concept of exogeneity and/or predeterminedness of a variable in a complete linear dynamic simultaneous-equation econometric model with additive i.i.d. disturbances is that a variable z_t in such a model is (strictly) exogenous if it is independent of all current, past and future disturbances in the model; it is predetermined if it is independent of all current and future disturbances. More precise definitions of exogeneity have been introduced by Engle, Hendry and Richard [6]. Denote the parameters of interest by ψ and the observed variables by x_t = [y_t' z_t']'. There are no explicit restrictions on what is included in the vector ψ. The joint density of the observations, which depends on a set of parameters λ, can be factored as a product of a conditional and a marginal density: D(x_t; λ) = D(y_t | z_t; λ_1) D(z_t; λ_2), where ψ is assumed to be identified. Consider the following requirements:

(a) (λ_1, λ_2) ∈ Λ_1 × Λ_2, where Λ_1 ∩ Λ_2 = ∅, i.e., this factorization "operates a cut";
(b) ψ = g(λ_1), i.e., the parameters of interest can be uniquely determined from λ_1 alone;
(c) y does not Granger cause z;
(d) λ_1 is invariant with respect to changes in λ_2.

If (a) and (b) hold, then z_t is weakly exogenous for estimating ψ. If (c) holds in addition, then z_t is strongly exogenous for the estimation of ψ. If (a), (b) and (d) hold, but not necessarily (c), then z_t is super exogenous for ψ. It is easy to see that the policy variable τ_t fails to be exogenous by any of these criteria, because (a) fails. Under some circumstances, however, it may be considered predetermined. Factor the joint density associated with the model of the preceding section as:

$$D(R_t, \tau_t; \lambda) = D(R_t \mid \tau_t; \lambda_1)\, D(\tau_t; \lambda_2). \qquad (15)$$

Let λ_1' = [α β Q̄ s²]. The parameter of interest is β. In section 2, we assumed the policymaker to know the values of α, Q̄ and s². We assume that the outside econometrician also has access to this information. If not, a restrictive information structure could easily violate the condition

for weak exogeneity of τ_t by violating requirement (b). Obviously, with this assumption, requirement (b) is satisfied, since ψ = ι'λ_1, where the vector ι' = [0 1 0 0]. Parameters of the process generating τ_t are included in λ_2. In our formulation, this process is driven by the maximization in (6) and depends on the current information set, which in turn contains past R's and τ's. The evolution of information at time t is summarized by (9) and (10). Clearly then, λ_2 includes at least some elements of λ_1. For example, beliefs (and hence the τ's) depend on α and β, as is evident from (8), (9) and (10). Thus the fact that the policymaker is learning implies an overlap

(i.e., a cross-restriction) between λ_1 and λ_2, so requirement (a) is violated.⁷ Therefore τ_t does not remain weakly exogenous for estimating β for the outside econometrician if the data were generated by a learning policymaker. This in turn implies that τ_t is also neither strongly nor super exogenous for β. Moreover, β could be more efficiently estimated if the generating process for τ_t were included in a joint estimation procedure, since this process also involves β. Disregarding this information results in a loss of efficiency in estimating β.

The dynamics of nontrivial learning also cause requirement (c) to fail for τ_t. R_t affects the subsequent expectation operators by contributing a nonzero increment of information to the information set. This is evident from (8), (9) and (10). The choice of τ_{t+1} in turn is a result of the decision rule involving the E_{t+1} operator, which is conditional on the current information set. Thus D(τ_{t+1} | I(t), R_t; λ_2) ≠ D(τ_{t+1} | I(t); λ_2), so that R_t Granger causes τ_{t+1}.

The Lucas [20] critique is that super exogeneity of policy variables may fail because agents, in responding to regime changes, adjust their expectations and, hence, their perception of the constraints they face, resulting in changed optimizing behavior. Since aggregated optimal decisions by agents constitute the data used in an econometric model, such adjustment makes parameters included in ψ dependent on regime changes for z. Mistakenly assuming that the aggregator function is not sensitive to policy regime change may have consequences potentially as devastating as ignoring the sensitivity of expectations to changes in policy regimes (Geweke [9]). Learning constitutes a third, independent channel implying a Lucas-type loss of parameter invariance, by generating a shift in the agent's perception of the constraint he faces: the information about the constraint is augmented in each period. The key fact is that the increment in information depends on the particular sequence of policy variables that has been applied. Thus, even in a non-game

situation, and assuming away the potential sensitivity of the aggregator function to regime changes, learning behavior can explain different observed behavior of economic agents under different data generating regimes.

Assume now that the parameters of interest are the structural parameters of a model describing the behavior of the policymaker in setting the policy variable. Because of Bayesian learning, m_t (the policymaker's mean perception of the parameter β) depends on all past values of τ. Assuming a diffuse prior, successive substitution into (9) and (10) yields

$$m_t = \frac{\sum_{i=1}^{t-1} \tau_i\, e_i}{\sum_{i=1}^{t-1} \tau_i^2},$$

which is precisely the ordinary least-squares estimator of β in the regression of the e_i's on the τ_i's, confirming the equivalence with least-squares learning noted in the introduction. Clearly, therefore, if the law of motion for the τ's, and hence their time path, were different, so would be m_t for each t. An analogous argument holds for higher moments. A different perception by the policymaker of the constraint he faces will in general make him behave differently. We then have the exact analogue of the Lucas mechanism: the function describing the behavior of the policymaker cannot have parameters independent of the law of motion for the exogenous variables.

The notion of strict exogeneity, in contrast to strong and super exogeneity, relies on the uncorrelatedness of current disturbances and current policy variables in the econometric model. If the information feedback is assumed to occur with a lag, then τ_t is contemporaneously uncorrelated with an i.i.d. disturbance term; hence, strict exogeneity need not be lost because of the presence of learning as described in the model of this paper. There is no contradiction of our earlier results, however, since such uncorrelatedness merely reflects the fact that consistent estimation of the parameter of interest may still be possible even though its efficient estimation is not. In our simple model the nonstationary character of the policy variable tapers off as learning becomes complete: the belief distribution converges to a point mass centered on the true parameter value. Even under these circumstances, however, the decision period must not be shorter than the observation period if the lagged information feedback is not to appear as an instantaneous one in the data, which would imply a contemporaneous correlation between the disturbance term and the policy variable, and hence loss of strict exogeneity. Moreover, both τ_t and the disturbance term are functions of t, and as functions of the same variable they cannot automatically be assumed to be uncorrelated. Again, in the simple model employed, this feature is only temporary, since the correlation induced by learning vanishes as

the belief distribution collapses. Asymptotic properties are not our focus here. However, incomplete learning can be shown to lead to a failure of strict exogeneity even in the limit.

The outside econometrician can, in principle, extend his model to include the informational feedback. If the econometric model encompassed the learning mechanism as well, estimating jointly the parameters characterizing learning and those of the econometric model proper is a possible strategy. A fundamental obstacle is the non-observability of crucial variables necessary to formulate an identifiable model for learning. Beliefs are unobservable, as is the degree of risk aversion, the precise information structure, the method of learning, computational constraints, the precise degree of rationality, the utility attached to acquiring information, etc. Any of these aspects can, in principle, be quantified to the extent necessary, but simultaneous inclusion is not feasible, in general. The appropriate notion of rationality, for example, depends on what the computational constraints are, and on what variables appear in the information set, which, in turn, is affected by limits on observability. These factors condition the method of learning (e.g., qualitative information cannot be incorporated the same way as quantitative information). Risk aversion may also affect the mode of learning, as shown in section 3. Finally, strategic interactions may result in Pareto-suboptimal equilibria, implying outcomes that are patently non-rational when viewed from a purely decision-making viewpoint.
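The violation of requirement (a) can be made explicit in the notation of (15); this is a restatement of the argument above, written out for the passive-learning case for concreteness. The marginal density of the policy variable is generated by the decision rule

$$\tau_t = \arg\max_{\tau \in [0,1]} \; \tau\,(1-\tau)\,(\alpha - m_t\,\tau), \qquad \text{with } m_t, h_t \text{ given by (9)-(10)},$$

so D(τ_t; λ_2) depends on α and s², both elements of λ_1' = [α β Q̄ s²]. The pair (λ_1, λ_2) therefore cannot vary freely over a product space Λ_1 × Λ_2, and the factorization (15) does not operate a cut.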

5. PROPERTIES OF TIME SERIES ARISING IN THE PRESENCE OF LEARNING AND THE DETECTABILITY OF LEARNING

Even in a stationary environment beliefs need not always converge to the true value of the unknown parameter. For example, the agent may be convinced by initial realizations of the outcome variable to apply noninformative controls forever (Kiefer [16], Feldman [7]). Alternatively, as Bossaerts [3] shows, heterogeneity of beliefs in a multiple-agent problem may lead to a nonrational equilibrium in the limit. Such nonconvergence phenomena open up the disconcerting possibility of more fundamental breakdowns than those discussed so far. A different set of controls may result in beliefs of the economic agent which converge to a limit completely different from the one to which they converged with the actual sample sequence giving rise to the value of the parameters estimated in the model. In

this case, substantially different parameter values would be implied. Policy experiments with the econometric model are invalidated by the presence of learning in this case even in the limit. As argued above, convergence of parameter estimates by a learning agent need not occur in the econometric sense even if the environment is stationary. If the environment is nonstationary, beliefs do not necessarily converge and learning does not necessarily recede. In summary, the effects of learning cannot be dismissed as merely transitory phenomena that are asymptotically irrelevant.

The restrictions placed by learning on the policy variable paths are not strict, because learning is compatible with a rich variety of policy variable profiles. Smooth, but possibly also abrupt, changes in the time path of the policy variable can occur as a result of learning by a rational agent. Bayesian learning induces a gradual (smooth) regime change for the exogenous variable, since the marginal distribution for the process generating the policy variable τ_t changes only slowly. Abrupt changes of regime for the exogenous variable can, however, result if, for example, policy is determined using the stochastic extension of the idea underlying the so-called golden section search: given the known unimodality of the tax-revenue function, after two observations the policymaker can truncate the support of the belief distribution of β. Truncation of the support of the belief distribution in general implies a discontinuous change in the process generating the τ's.⁸ Of course this defines an abrupt change for the inside econometrician only. Therefore, a suitable definition of abruptness of change is required for the outside econometrician.

The detectability of learning is an important issue. Can the outside econometrician, using observed data alone, determine if learning has played a role in the data generating process? This is important because if he cannot, the results obtained in this paper are of limited practical use. On the other hand, if it is found that the presence of learning is a testable proposition, overlooking the loss of exogeneity due to learning is a caveat that a careful applied econometrician can avoid. Were a test available to detect the presence of learning, it could be added to the arsenal of specification tests applied to proposed econometric models before they are estimated and put to use. The main emphasis of this paper is the analysis of the consequences of learning behavior for econometric exogeneity. It is natural therefore to propose exogeneity tests to uncover the presence of learning. Since exogeneity is a refutable hypothesis, our argument is that it is sufficient to test for the loss of exogeneity to determine whether learning has been present in the data generating mechanism if other sources of failure of exogeneity are ruled out. While this is a valid and logically correct

argument, there are two practical problems. First, given the results of this paper, it is obvious that one should test for weak exogeneity. However, as Geweke [8] notes, weak exogeneity does not generate statistically refutable hypotheses, since a model can always be constructed with parameters of interest chosen in such a way that any variable can end up being weakly exogenous. Second, we have argued that learning generates nonstationary time series; therefore proposed tests must be robust in the presence of nonstationarity.

What then is the solution? As opposed to weak exogeneity, strict exogeneity is testable. Moreover, given the exact specification of the econometric model at hand, it is frequently possible to link weak exogeneity to strict exogeneity, usually in the form of parameter restrictions (example 3.2 in Engle, Hendry and Richard [6]). Then the joint hypothesis that the linking assumption holds and that the policy variable is weakly exogenous is testable. The test to be applied turns out to be a Granger causality test (see Geweke [8]), which, under the assumption that the outside econometrician includes the correct variables in his model, coincides with the test of the Granger causality relationship involved in the definition of strong exogeneity of the policy variables. Hence, the test can be interpreted not only as an indirect (in the sense described above) test of weak exogeneity, but also as a direct test of strong exogeneity. Rejection of weak exogeneity would of course in itself imply rejection of strong and super exogeneity. A more direct test for super exogeneity is a Chow test for invariance of the parameters of interest.

Nonstationarity of the observed time series generated by a model involving learning is perhaps a more serious matter. The effects of nonstationarity of the unit-root or stochastic-trend type are nonnegligible. Phillips and Durlauf [27], Sims and Stock [31], as well as Bossaerts [3], show that unit-root nonstationarity invalidates the usual causality tests, which are what we propose for detecting the presence of learning. Other types of nonstationarity appear to be more benign (Hosoya [15]). The problem with the usual tests in the presence of unit roots is that the distribution of the test statistics is nonstandard if an I(1) variable is regressed on an I(0) variable. Stock [32], however, refers to a transformation of the equation to be estimated such that the parameters of interest are expressed as coefficients of zero-mean stationary variables; the usual F tests then apply for that subset of parameters even in the presence of other I(1) regressors. If, in a specific case, it can be shown that learning induces I(1) behavior, it makes sense to seek such a transformation.
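A minimal sketch of the kind of test proposed here, under illustrative assumptions (the data, lag length and variable names below are ours, not the paper's): regress the policy variable on its own lags and on lags of the outcome variable, and F-test the joint significance of the outcome-variable lags. Rejection indicates that the outcome Granger causes the policy variable, the feedback pattern a learning policymaker would generate.

    import numpy as np
    from scipy import stats

    def granger_f_test(policy, outcome, p=2):
        """F-test of H0: lags of `outcome` do not help predict `policy`
        beyond p lags of `policy` itself (a Granger noncausality test)."""
        policy, outcome = np.asarray(policy, float), np.asarray(outcome, float)
        T = len(policy)
        y = policy[p:]
        ones = np.ones(T - p)
        own_lags = np.column_stack([policy[p - j:T - j] for j in range(1, p + 1)])
        out_lags = np.column_stack([outcome[p - j:T - j] for j in range(1, p + 1)])
        X_r = np.column_stack([ones, own_lags])               # restricted model
        X_u = np.column_stack([ones, own_lags, out_lags])     # unrestricted model
        rss_r = np.sum((y - X_r @ np.linalg.lstsq(X_r, y, rcond=None)[0]) ** 2)
        rss_u = np.sum((y - X_u @ np.linalg.lstsq(X_u, y, rcond=None)[0]) ** 2)
        df1, df2 = p, len(y) - X_u.shape[1]
        F = ((rss_r - rss_u) / df1) / (rss_u / df2)
        return F, stats.f.sf(F, df1, df2)

    # Hypothetical data with feedback: the policy variable reacts to last period's outcome.
    rng = np.random.default_rng(2)
    T = 300
    outcome, policy = np.zeros(T), np.zeros(T)
    policy[0] = 0.4
    for t in range(1, T):
        outcome[t - 1] = 0.5 - 0.3 * policy[t - 1] + 0.05 * rng.standard_normal()
        policy[t] = 0.6 * policy[t - 1] + 0.2 * outcome[t - 1] + 0.02 * rng.standard_normal()
    outcome[T - 1] = 0.5 - 0.3 * policy[T - 1] + 0.05 * rng.standard_normal()

    F, pval = granger_f_test(policy, outcome, p=2)
    print(f"F = {F:.2f}, p-value = {pval:.4f}")

Under the maintained assumptions of the paper, failing to reject is consistent with strong exogeneity of the policy variable, while rejection is consistent with (though, as argued below, not proof of) learning.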

But is it likely that I(1) behavior can be induced by learning? In our simple model, the tax rate is constrained to be between zero and one. Such a constraint effectively rules out a unit root in the tax rate itself, since a unit root would cause the rate to wander outside the [0, 1] interval with probability 1. (For example, Nelson and Plosser [24] found unit roots in a variety of U.S. time series, but not in the unemployment rate, which is constrained to the interval [0, 1].) Moreover, if the method of learning is Bayesian updating, and the truth is constant, then learning, which is the incremental accumulation of information, is not compatible with the constant random "rebasing" of future expectations which is characteristic of random walk processes. The increments in information are uncorrelated with existing knowledge in every period, and thus with each other. If, however, learning does not converge, or the underlying relationship about which the decision maker learns is not constant, it may well happen that learning induces I(1) behavior in an otherwise stationary series. For example, suppose that learning is a Kalman filter tracking an I(1) variable. In this case the resulting beliefs, and hence the control variable based on them, will have a unit root. Unit roots pose problems with causality testing. Learning may be a source of unit roots and, in this way, interfere with our proposed test for detecting it. If, in a particular case, it can be shown that learning is a potential source of I(1) behavior, which indeed we believe will be unlikely in most cases of empirical interest, then either a transformation of the sort suggested by Stock [32] will have to be found or we need to modify the usual t- or F-test rejection regions.

Finally, we note that learning typically does not lead to inconsistency when a policy variable is treated as if it were exogenous (because its value is typically based on past observations), but it clearly does lead to inefficient estimation. We have not investigated more precisely the loss of efficiency which results, which thus remains a topic for future investigation.

In summary, given a correct specification of the econometric model and a condition linking weak to strict exogeneity, at least an indirect test for all the exogeneity concepts can be formulated. Given the results of this paper, these are also tests for the presence of learning if other sources of endogeneity are ruled out for the policy variable. However, one must be very careful to note that failure of exogeneity may occur for other reasons, whose absence is a maintained hypothesis in the proposed tests. Even if learning can be justified as the primary source of failure of exogeneity, the standard tests of causality, on which our proposal for the detection of learning is based, may fail because of

unit-root nonstationarity induced by the very learning behavior we want to detect. We argue that such I(1) behavior is generally unlikely.
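As a practical pre-check of the kind this discussion suggests (our illustration, not a procedure from the paper), one can apply a standard augmented Dickey-Fuller test to the observed policy-variable series before relying on conventional causality-test critical values. A bounded, converging tax-rate path of the sort generated by the learning model of section 2 would typically lead to rejection of the unit-root null.

    import numpy as np
    from statsmodels.tsa.stattools import adfuller

    # Illustrative bounded tax-rate series: a converging learning path plus noise (not real data).
    rng = np.random.default_rng(3)
    t = np.arange(200)
    tau = 0.40 + 0.10 * np.exp(-0.05 * t) + 0.01 * rng.standard_normal(200)

    adf_stat, pvalue, usedlag, nobs, crit, icbest = adfuller(tau, regression="c", autolag="AIC")
    print(f"ADF statistic = {adf_stat:.2f}, p-value = {pvalue:.4f}")
    # A small p-value rejects the unit-root null, so standard causality-test inference is less problematic.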

6. SUMMARY

This paper demonstrates that policy variables chosen by a learning policymaker cannot be considered exogenous for econometric modeling purposes, by any definition, from the standpoint of an outside observer. Hence, econometric inference is problematic in a model which requires the assumption of exogeneity of policy variables chosen by a learning decisionmaker. Loss of parameter invariance under regime changes in the estimated econometric model is shown to occur in the presence of learning in the short run, and to be a potential problem also in the long run. The presence of learning in the data generating process is detectable precisely because of the implications of learning for exogeneity discussed in the paper. However, since failure of exogeneity can arise for other reasons, these cannot be ruled out by such tests.

Unfortunately, the implications for econometric practice are discomfiting. If exogeneity fails, such failure may be due to learning. If it is, then data on prior beliefs and the assumption that beliefs converge to the truth are central to correct structural estimation. Such prior beliefs matter less and less as the sample size grows, so that structural estimation that dispenses with them is possible, if at all, only in the limit; yet Kirman [17] and McLennan [23], for example, show that there are plausible cases in which the required convergence does not occur. Has the faith that structural estimation is desirable, appropriate and necessary really been a misguided belief on the part of econometricians since the time of Haavelmo [12]? The implication of this paper is that quite possibly it has.

NOTES

1. The support of the random variable u_t is (-K, K), rather than (-∞, ∞), where K is a positive constant depending on the values of the parameters α, β, s² in the model. Numerical simulations using various settings of the parameters, reported in Horvath [14], showed that, for reasonable parameter values, K is always large, i.e., the restriction implied by the truncation is mild.

2. Our convention for subscripting is the following: m_t is the mean belief about β at the time when τ_t is chosen, but before R_t is observed.

3. This regularity condition involving the conditional expectation of the unknown parameter could be replaced by appropriate technical conditions on the support and variance of the noise variable implying an m_t sequence satisfying (11) in each period. In the case of active learning, the maximum is not necessarily unique.

4. Jensen's inequality is used for a function of a probability distribution. The validity of this step is also implicitly assumed by Easley and Kiefer ([4], section 5.ii, p. 1057).

5. The assumption that more information cannot hurt holds in our context. However, Arrow [1] contains an especially simple and intuitive counterexample. His example relies on the fact that additional information may eliminate the possibility of trading risks without doing any offsetting good.

6. A simple proof, pointed out to us by N. Kiefer, is as follows: Let

l(p, q) = E_p E{returns following implementation of policy q}, where p is a probability measure. Then

$$\lambda V(p_1) + (1 - \lambda)\, V(p_2) = \lambda\, l(p_1, q_1) + (1 - \lambda)\, l(p_2, q_2)$$

$$\ge \lambda\, l(p_1, q^*) + (1 - \lambda)\, l(p_2, q^*) = l\big(\lambda p_1 + (1 - \lambda) p_2,\, q^*\big) = V\big(\lambda p_1 + (1 - \lambda) p_2\big),$$

where q_i is the optimal policy under beliefs p_i and q* is chosen to maximize l(λp_1 + (1 − λ)p_2, q).

7. More generally, any kind of feedback from observations of the endogenous variable to the function determining the choice of consecutive exogenous variables implies such an overlap.

8. Suppose the policymaker knows for certain that the Laffer curve is stable and has a unique peak. Then, after two observations on τ_t and R_t, he can rule out a large interval for the parameter determining the location of the peak. For example, if τ_2 > τ_1 and R_2 ≥ R_1, then the peak lies at a tax rate no smaller than τ_1. For a discussion of the stochastic extension see Le Cam and Olshen [19].

REFERENCES

1. Arrow, K.J. Risk Allocation and Information: Some Recent Theoretical Developments, First Annual Lecture of the Geneva Convention, Association Internationale pour l'Etude de l'Economie de l'Assurance, Geneva, 1978.

2. Barro, R.J. Rational Expectations and the Role of Monetary Policy, Journal of Monetary Economics, 2 (1976): 1-32.

3. Bossaerts, P. Asset Prices in a Speculative Market, unpublished, December 1992.

4. Easley, D. and N.M. Kiefer. Controlling a Stochastic Process with Unknown Parameters, Econometrica, 56 (1988): 1045-1064.

5. El-Gamal, M.A. and R.K. Sundaram. Bayesian ... Bayesian Agents I: An Alternative Approach to Optimal Learning, Social Science Working Paper 705, California Institute of Technology, 1989.

6. Engle, R.F., D.F. Hendry and J.-F. Richard. Exogeneity, Econometrica, 51 (1983): 277-304.

7. Feldman, M. Comment on Kiefer (1988-89), Econometric Reviews, 7 (1988-89): 149-154.

8. Geweke, J. Inference and Causality in Economic Time Series, Chapter 19 of Griliches, Z. and Intriligator [10].

9. Geweke, J. Macroeconometric Modeling and the Theory of the Representative Agent, American Economic Review, 75 (1985): 206-210.

10. Griliches, Z. and M. Intriligator, eds. Handbook of Econometrics, New York: North Holland, 1984.

11. Grossman, S.J., R.E. Kihlstrom and L.J. Mirman. A Bayesian Approach to the Production of Information and Learning by Doing, Review of Economic Studies, 44 (1977): 533-547.

12. Haavelmo, T. The Probability Approach in Econometrics, Supplement to Econometrica, 12 (1944).

13. Hansen, L.P. and T.J. Sargent. Formulating and Estimating Dynamic Linear Rational Expectations Models, Journal of Economic Dynamics and Control, 2 (1980): 7-46.

14. Horvath, B. Are Policy Variables Exogenous? Lecture Notes in Economics and Mathematical Systems, #364, Heidelberg: Springer Verlag, 1991.

15. Hosoya, Y. On the Granger Condition for Noncausality, Econometrica, 45 (1977): 1735-1736.

16. Kiefer, N.M. Optimal Collection of Information by Partially Informed Agents, Econometric Reviews, 7 (1988-89): 133-148.

17. Kirman, A. On Mistaken Beliefs and Resultant Equilibrium, pp. 147-166 in R. Frydman and E.S. Phelps, eds. Individual Forecasting and Aggregate Outcomes, Cambridge University Press, 1983.

18. Koopmans, T.C., ed. Statistical Inference in Dynamic Economic Models, Cowles Commission Monograph #10, New York: John Wiley, 1950.

19. Le Cam, L.M. and R.A. Olshen, eds. Proceedings of the Berkeley Conference in Honor of J. Neyman and J. Kiefer, June 1983, Vol. 2, Monterey: Wadsworth Advanced Books, 1985.

20. Lucas, R.E. Econometric Policy Evaluation: A Critique, in Brunner, K. and A.H. Meltzer, eds. The Phillips Curve and Labor Markets, 19-46, Carnegie-Rochester Conference Series #1, New York: North Holland, 1976.

21. Marschak, J. Economic Measurement for Policy and Prediction, pp. 1-26 in W.C. Hood and T.C. Koopmans, eds. Studies in Econometric Method. New York: John Wiley, 1953.

22. McCallum, B.T. The Current State of the Policy-Ineffectiveness Debate, American Economic Review, 69 (1979): 240-245.

23. McLennan, A. Price Dispersion and Incomplete Learning in the Long Run, Journal of Economic Dynamics and Control, 7 (1984): 331-347.

24. Nelson, C.R. and C.I. Plosser. Trends and Random Walks in Macroeconomic Time Series, Journal of Monetary Economics, 10 (1982): 139-162.

25. Nerlove, M. A Tabular Survey of Macro-Econometric Models, International Economic Review, 7 (1966): 127-175.

26. Pesaran, M.H. The Limits to Rational Expectations, New York: Basil Blackwell, 1987.

27. Phillips, P. C. B. and S. Durlauf. Trends versus Random Walks in Time Series Analysis, Econometrica, 56 (1988): 1333-1354.

28. Pindyck, R.S., and D.L. Rubinfeld. Econometric Models and Economic Forecasts, Third Edition. New York: McGraw-Hill, 1991.

29. Prescott, E. C. Multiperiod Control Problem Under Uncertainty, Econometrica, 40 (1972): 1043-1058.

30. Sargent, T.J. and N. Wallace. "Rational" Expectations, the Optimal Monetary Instrument and the Optimal Money Supply Rule, Journal of Political Economy, 83 (1975): 241-254.

31. Sims, C.A. and J.H. Stock. Inference in Linear Time Series Models with Some Unit Roots, Econometrica, 58 (1990): 113-144.

32. Stock, J. H. Recent Developments in the Analysis of Time Series Data: A Practical Survey, Lecture given at the IMF, March 16, 1993.

33. Wallis, K.F. Econometric Implications of the Rational Expectations Hypothesis, Econometrica, 48 (1980): 49-74.

