<<

Richard T. Ely Lecture

Beliefs, Doubts and Learning: Valuing Macroeconomic Risk

By Lars Peter Hansen*

This essay examines the problem of inference sophisticated investment decisions. Should we within a rational expectations model from two put econometricians and economic agents on perspectives: that of an econometrician and that comparable footing, or should we endow eco- of the economic agents within the model. The nomic agents with much more refined statistical assumption of rational expectations has been knowledge? and remains an important component to quan- From an econometric standpoint, the out- titative research. It endows economic decision come of the rational expectations approach is makers with knowledge of the probability law the availability of extra information about the implied by the economic model. As such, it is an underlying economic model. This information equilibrium concept. Imposing rational expecta- is reflected in an extensive set of cross-equation tions removed from consideration the need for restrictions. These restrictions allow an econo- separately specifying beliefs or subjective com- metrician to extract more precise information ponents of uncertainty. Thus, it simplified model about parameters or to refine the specification specification and implied an array of testable of exogenous processes for the model builder. implications that are different from those con- To understand the nature of these restrictions, sidered previously. It reframed policy analysis consider a dynamic model in which economic by questioning the effectiveness of policy levers agents must make investment decisions in physi- that induce outcomes that differ systematically cal, human, or financial capital. The decision to from individual beliefs. invest is forward-looking because an investment I consider two related problems. The first is made today has ramifications for the future the problem of an econometrician who follows capital stock. The forward-looking nature of John F. Muth (1961), Robert E. Lucas, Jr., and investment induces decision makers to make Edward C. Prescott (1971), Lucas (1972a), predictions or forecasts as part of their current Thomas J. Sargent (1973), and an extensive body period choice of investment. The forward-look- of research by adopting an assumption of rational ing perspective affects equilibrium outcomes expectations on the part of economic agents. In including market valuations of capital assets. implementing this approach, researchers abstract Rational expectations presumes from hard statistical questions that pertain to that agents know the probabilities determin- model specification and estimation. The second ing exogenous shocks as they formulate their problem is that of economic decision makers or choices. This translates to an extensive set of investors who must forecast the future to make cross-equation restrictions that can be exploited to aid identification and inference. * Department of Economics, , 1126 The cross-equation restrictions broadly con- E. 59th St., Chicago, IL 60637 (e-mail: l-hansen@uchicago. ceived are a powerful tool, but to what extent edu). The topics I cover in this paper have been influenced by should we as applied researchers rely on it? As an extensive collaboration with Thomas Sargent. I greatly applied time series econometricians, we routinely appreciate conversations with John Heaton, Ravi Jagannathan, confront challenging problems in model specifi- Monika Piazzesi, and Martin Schneider. I owe a special acknowledgement to Thomas Sargent and Grace Tsiang cation. How do we model stochastic dynamics in who provided many valuable comments on preliminary the short and long run? What variables are best drafts of this paper. Also I want to thank participants at forecasters? How do we select among competing workshops at New York University and the Federal Reserve models? Bank of Chicago. Junghoon Lee and Ricardo Mayer pro- vided expert research assistance. This material is based A heuristic defense for rational expectations upon work supported by the National Science Foundation appeals to a Law of Large Numbers and gives under award SES0519372. agents a wealth of data. This allows, at least as   AEA PAPERS AND PROCEEDINGS MAY 2007 an approximation, for us the model builders to • How is learning altered when decision mak- presume investor knowledge of a probability ers admit that the models are misspecified or model and its parameters. But statistical infer- simplified? ence, estimation, and learning can be difficult in practice. In actual decision making, we may By answering these questions, we will see be required to learn about moving targets, to how statistical ambiguity alters the predicted make parametric inferences, to compare model risk-return relation, and we will see when performance, or to gauge the importance of learning induces model uncertainty premia long-run components of uncertainty. As the that are large when macroeconomic growth is statistical problem that agents confront in our sluggish. model is made complex, rational expectations’ presumed confidence in their knowledge of the I. Rational Expectations and Econometrics probability specification becomes more tenu- ous. This leads me to ask: (a) how can we bur- The cross-equation restrictions are the novel den the investors with some of the specification component to rational expectations economet- problems that challenge the econometrician, rics. They are derived by assuming investor and (b) when would doing so have important knowledge of parameters and solving for equi- quantitative implications? I confront these ques- librium decision rules and prices. I consider tions formally by exploring tools that quantify two examples of such restrictions from the asset when learning problems are hard, by examining pricing literature, and review some estimation the Bayesian solution to such problems and by methods designed for estimating models subject speculating on alternative approaches. to such restrictions. One example is the equilib- In this essay, I use the literature that links mac- rium wealth-consumption ratio and the other is roeconomics and asset pricing as a laboratory for a depiction of risk prices. examining the role of expectations and learning. The linkage of and is a A. Cross-Equation Restrictions natural choice for study. Even with a rich array of security markets, the macroeconomic risks Consider an environment in which equilib- cannot be diversified away (averaged out across rium consumption evolves as: investors), and hence are reflected in equilib- rium asset prices. Exposure to such risks must be (1) ct11 2 ct 5 mc 1 a · t 1 scut11 rewarded by the marketplace. By studying asset pricing, we, as model builders, specify the for- zt11 5 Azt 1 szut11, ward-looking beliefs of investors and how they cope with risk and uncertainty. Prior to develop- where ct is the logarithm of consumption, {ut} ing asset pricing applications, we consider some is an i.i.d. sequence of normally distributed stylized statistical decision and inferential prob- random vectors with mean zero and covariance lems that turn out to be informative. matrix I, and { t} is the process used to forecast I ask five questions that are pertinent to mod- consumption growth rates. I take equation (1) as eling the linkages between asset pricing and the equilibrium law of motion for consumption. macroeconomics: Following David M. Kreps and Evan L. Porteus (1978) and Larry G. Epstein and Stanley • When is estimation difficult? E. Zin (1989), I use a model of investor prefer- ences in which the intertemporal composition • What are the consequences for the econo- of risk matters. I will have more to say about metrician? such preferences subsequently. As emphasized by Epstein and Zin (1989), such preferences give • What are the consequence for economic a convenient way to separate risk and intertem- agents and for equilibrium outcomes? poral substitution. John Y. Campbell (1996) and others have used log linear models with such • What are the real time consequences of investor preferences to study cross-sectional learning? returns. VOL. 97 NO. 2 Richard T. Ely Lecture 

Wealth-Consumption Ratio.—Let r be the Shadow Risk Prices.—Assume a unitary inverse of the intertemporal elasticity of substi- elasticity of substitution and a recursive utility­ tution and b be the subjective discount factor. risk parameter g and a discount factor b and Approximate (around r 5 1): the same consumption dynamics. Consider the price of the one-period exposure to the shock vector . Following the convention in (2) w 2 c < 2log11 2 b2 ut11 t t finance, let the price be quoted in terms of the mean reward for being exposed to uncertainty. 1 2 r ba 2 b 21 1 m , 11 2 3 r1I A2 zt v 4 For Kreps and Porteus (1978) preferences, the intertemporal composition of risk matters, and where wt is log wealth. The constant term mv as a consequence the consumption dynamics are includes a risk adjustment. A key part of this reflected in the equilibrium prices, including the relation is the solution to a prediction problem: one-period risk prices. This linkage has been a focal point of work by Ravi Bansal and Amir ` Yaron (2004) and others. Specifically, the one- E bj 1 2 2 m 2 0 period price vector is c a ct1j ct1j21 c zt d j51 21 p 5 sc 1 3b1g 2 12ar1I 2 bA2 sz 4. 21 5 ba9(I 2 bA) zt. Later, I will add more detail about the construc- tion of such prices. For now, I simply observe Formula (2) uses the fact that preferences I con- that while this price vector is independent of the sider are represented recursively with an aggre- state vector zt, it depends on the vectors sc and sz gator that is homogeneous of degree one. As a along with the A matrix. Again, we have cross- consequence, Euler’s theorem gives a simple equation restrictions, but now the coefficients relation between the shadow value of the con- that govern variability also come into play. sumption process and the continuation value Pricing a claim to the next period shock is for that process. This shadow value includes the only one of many prices needed to price a cash corresponding risk adjustments. The intertem- flow or a hypothetical claim to future consump- poral budget constraint says that wealth should tion. Indeed, risk prices can be computed for equal the value of the consumption process. all horizons. Moreover, as shown by Hansen, The formula follows by taking a derivative with John C. Heaton, and Nan Li (2006) for log lin- respect to r. ear models like this one, and more generally by The restriction across equations (1) and (2) is Hansen and Jose Scheinkman (2006), the limit exemplary of the type of restrictions that typi- prices are also well defined. In this example the cally occur in linear rational expectations mod- limit price is els. The matrix A that governs the dynamics of 21 the {zt} process also shows up in the formula 3sc 1 a91I 2 A2 sz4 for the wealth-consumption ratio, and this is the 21 cross-equation restriction. Very similar formu- 1 3b 1g 2 12 a91I 2 bA2 sz4. las emerge in models of money demand (Rusdu Saracoglu and Sargent 1978), quadratic adjust- Cross-equation restrictions again link the con- ment cost models (Hansen and Sargent 1980) sumption dynamics and the risk prices. and in log-linear approximations of present- For these asset pricing calculations and for value models (Campbell and Robert J. Shiller some that follow, it is pedagogically easiest to 1988). view (1) as the outcome of an endowment econ- omy as in Lucas (1978). There is a simple produc- tion interpretation, however. Consider a so-called Ak production economy where output  See Hansen, et al. (forthcoming) for a derivation and is a linear function of capital and a technology see Campbell and Shiller (1988) and Restoy and Weil (1998) shock. Since consumers have unitary elasticity for closely related log-linear approximations. of intertemporal substitution (logarithmic utility  AEA PAPERS AND PROCEEDINGS MAY 2007 period utility function), it is well-known that the by the econometrician. Rewrite the representa- wealth-consumption ratio should be constant. tion of the wealth-consumption ratio as: The first-difference in consumption reveals the logarithm of the technology shock. The process { t} is a predictor of the growth rate in the tech- wt 2 ct L 2log(1 2 b) 1 mc 1 (1 2 r) nology. Of course this is a special outcome of this model, driven in part by the unitary elastic- ` ity assumption. The setup abstracts from issues 3 qE c bj 1 2 2 m 2 0H d 1 m r a ct1j ct1j21 c t v related to labor supply, adjustment costs, and j51 other potentially important macroeconomic ingredients, but it gives pedagogical simplicity 1 et. that we will put to good use. In summary, under the simple production-economy interpretation, The “error” term et captures omitted informa- our exogenous specification of a consumption- tion. Given that the econometrician solves the endowment process becomes a statement about prediction problem correctly based on his more the technology shock process. limited information set, the term et satisfies In computing the equilibrium outcomes in both examples, I have appealed to rational E 3et Z Ht4 5 0, expectations by endowing agents with knowl- edge of parameters. A rational expectations and this property implies orthogonality condi- econometrician imposes this knowledge on the tions that are exploitable in econometric esti- part of agents when constructing likelihood mation. Econometric relations often have other functions, but necessarily confronts statistical unobservable components or measurement errors uncertainty when conducting empirical investi- that give additional components to an error term. gations. Economic agents have a precision that Alternative econometric methods were devel- is absent for the econometrician. Whether this oped for handling estimation in which informa- distinction is important will depend on applica- tion available to economic agents is omitted by tion, but I will suggest some ways to explore to an econometrician (see Shiller 1972; Hansen and assess this. Prior to considering such questions, Sargent 1980; Hansen 1982; Robert E. Cumby, I describe some previous econometric develop- John Huizinga, and Maurice Obstfeld 1983; and ments that gave economic agents more informa- Funio Hayashi and Christopher A. Sims 1983). tion in addition to knowledge of parameters that A reduced-information counterpart to the ratio- generate underlying stochastic processes. nal expectations cross-equation restrictions is present in such estimation. B. Econometrics and Limited Information When the only source of an “error term” is omitted information, then there is another pos- Initial contributions to rational expecta- sible approach. The wealth-consumption ratio tions econometrics devised methods that per- may be used to reveal to the econometrician mitted economic agents to observe more data an additional component of the information that an econometrician used in an empirical available to economic agents (see, for example, investigation. To understand how such meth- Hansen and Sargent 1991; and Hansen, Robards, ods work, consider again the implied model and Sargent 1991). This is the econometricians’ of the wealth consumption ratio, and ask what counterpart to the literature on rational expecta- happens if the econometrician omits infor- tions with private information in which prices mation by omitting components of t. Let Ht reveal information to economic agents. denote the history up to date t of data used There is related literature on estimating and testing asset pricing restrictions. Asset pric- ing implications are often represented con- veniently as conditional moment restrictions  Thomas D. Tallarini, Jr. (2000) considers a production counterpart with labor supply, but without the extra depen- where the conditioning information set is that dence in the growth rate of technology shock, and without of economic agents. By applying the Law of adjustment costs. Iterated Expectations, an econometrician can, VOL. 97 NO. 2 Richard T. Ely Lecture  in effect, use a potentially smaller information ­decision maker can distinguish between compet- set in empirical investigation (see Hansen and ing models more easily. Rational expectations Kenneth J. Singleton 1982; Hansen and Scott F. as an approximation conceive of a limit that is Richard 1987; and others). used to justify private agents’ commitment to All these methods exploit the potential infor- one model. When is this a good approximation? mation advantage of investors in deducing A statistical investigation initiated by Herman testable restrictions. The methods work if the Chernoff (1952) gives a way to measure how close information that is omitted can be averaged out probability models are, one to another. It quanti- over time. These methods lose their reliability, fies when statistical discrimination is hard, and however, when omitted information has a very what in particular makes learning challenging. low frequency or time invariant component, as Suppose there is a large dataset available that in the case of infrequent regime shifts. is used prior to a decision to commit to one of While this literature aimed at giving eco- two models, say model a or model b. Consider nomic agents more information than an econo- an idealized or simplified decision problem in metrician along with knowledge of parameters, which one of these models is fully embraced, in what follows, I will explore ways to remove given this historical record, without challenge. some of this disparity, and I will illustrate some By a model, I mean a full probabilistic specifi- tools from statistics that are valuable in quanti- cation of a vector of observations Y. Each model fying when model selection is difficult. provides an alternative probability specification for the data. Thus, a model implies a likelihood II. Statistical Precision function, whose logarithms we denote by /(Y Z m 5 a) and /(Y Z m 5 b), respectively, where m Statistical inference is at the core of decision is used to denote the model. The difference in making under uncertainty. According to statisti- these log-likelihoods summarizes the statistical cal decision theory, enlightened choices are those information that is available to tell one model based on the data that have been observed. When from another given data, but more information imposing rational expectations, a researcher is required to determine the threshold for such a must decide with what prior information to decision. For instance, Bayesian and mini-max endow the decision maker. This specification model selection lead us to a decision rule of the could have trivial consequences, or it could have form: choose model a if consequences of central interest. In this section, I consider a measure of statistical closeness that /(Y Z m 5 a) 2 /(Y Z m 5 b) $ d, will be used later in this paper. This measure helps quantify statistical challenges for econo- where d is some threshold value. What deter- metricians as well as economic agents. mines the threshold value d? Two things: the Suppose there is some initial uncertainty about losses associated with selecting the wrong model the model. This could come from two sources: and the prior probabilities. Under symmetric the econometrician not knowing the model losses and equal prior probabilities for each (this is a well-known phenomenon in rational model, the threshold c is zero. Under symmetric expectations econometrics) or the agents them- losses, the mini-max solution is to choose d so selves not knowing it. Past observations should that the probability of making a mistake when be informative in model selection for either the model a is true is the same as the probability of econometrician or economic agent. Bayesian making a mistake when model b is true. Other decision theory offers a tractable way to pro- choices of loss functions or priors result in other ceed. It gives us an excellent benchmark and choices of d. As samples becomes more infor- starting point for understanding when learning mative, the mistake probabilities converge to problems are hard. zero either under nondegenerate Bayesian priors In a Markov setting, a decision maker observes or under the mini-max solution. states or signals, conditioning actions on these Limiting arguments can be informative. After observations. Models are statistically close if all, rational expectations is itself motivated by a they are hard to tell apart given an observed limiting calculation, the limit of an infinite num- history. With a richer history, i.e., more data, a ber of past observations in which the unknown  AEA PAPERS AND PROCEEDINGS MAY 2007 model is fully revealed. Chernoff’s method sug- criterion leads us to compute the difference in gests a refinement of this by asking what hap- the log likelihood: pens to mistake probabilities as the sample size of signals increases. Chernoff (1952) studies 1 T this question when the data generation is i.i.d., 2 1x 2 m 29S211x 2 m 2 a t a t a but there are extensions designed to accommo- 2 t51 date temporal dependence in Markov environ- 1 T ments (see for example Charles M. Newman and 1 1x 2 m 29S211x 2 m 2 a t b t b W. Barton Stuck 1979). Interestingly, the mis- 2 t51 take probabilities eventually decay at a common T geometric rate. The decay rate is independent of 5 2 1x 29S211m 2 m 2 a t b a the precise choice of priors, and it is the same t51 for the mini-max solution. I call this rate the  T 21 T 21 Chernoff rate and denote it by r. 1 1mb29S mb 2 1ma29S ma. In an i.i.d. environment, Chernoff’s analysis 2 2 leads to the study of the following entity. Both densities are absolutely continuous with respect Notice that the random variable in the second to a measure h. This absolute continuity is per- equality is normally distributed under each tinent so that we may form likelihood functions model. Under model a, the distribution is nor- that can be compared. The Chernoff rate for mal with mean: i.i.d. data is: T 21 3221ma29S 1mb 2 ma2 r 5 2log sup E(exp [a/(Yi Z m 5 b) 2 0#a#1 21 21 1 1mb29S mb 2 1ma29S ma4 2 a/(Yi Z m 5 a)] Z m 5 a).

T 21 This formula is symmetric in the role of the 5 1ma 2 mb29S 1ma 2 mb2 2 models, as can be verified by interchanging the roles of the two models throughout and and variance equal to twice this number. Under by replacing a by 1 2 a. The Chernoff rate is model b, the mean is the negative of this quan- justified by constructing convenient bounds of tity and the variance remains the same. Thus, indicator functions with exponential functions. the detection error probabilities are represent- Chernoff’s (1952) elegant analysis helped to ini- able as probabilities that normally distributed tiate an applied mathematics literature on the random variables exceeds a threshold. theory of large deviations. The following example is simple but reveal- In this simple example, the Chernoff rate is ing, nevertheless. 1 21 r 5 31ma 2 mb29S 1ma 2 mb24. Example 1: Suppose that xt is i.i.d. nor- 8 mal. Under model a, the mean is ma and under model b, the model is mb. For both models the This can be inferred directly from properties of covariance matrix is S. In addition, suppose the cumulative normal distribution, although the that model a is selected over model b if the log- Chernoff (1952) analysis is much more generally ­likelihood exceeds a threshold. This selection applicable. The logarithm of the average proba- bility of making a mistake converges to zero at a rate r given by this formula. This representation captures, in a formal sense, the simple idea that  It is often called Chernoff entropy in the statistics when the population means are close together, literature. they are very hard to distinguish statistically. In  While it is the use of relative likelihood functions that links this optimal statistical decision theory, Chernoff this case, the resulting model classification error (1952) also explores discrimination based on other ad hoc probabilities converge to zero very slowly and, statistics. conversely, when the means are far apart. VOL. 97 NO. 2 Richard T. Ely Lecture 

           

 

NJTUBLFQSPCBCJMJUZ    

 MPHBSJUINPGNJTUBLFQSPCBCJMJUZ               TBNQMFTJ[F TBNQMFTJ[F

Figure 1. Mistake Probabilities Figure 2. Logarithm of Mistake Probabilities

Notes: This figure displays the probability of making a mis- Note: This figure displays the logarithm of the probabil- take as a function of sample size when choosing between ity of making a mistake as a function of sample size when the predictable consumption growth rate model and the choosing between the predictable consumption growth i.i.d. model for consumption growth. The probabilities model and the i.i.d. model for consumption growth. This assume a prior probability of one-half for each model. curve is the logarithm of the curve in Figure 1. The mistake probabilities are essentially the same if mini- max approach is used in which the thresholds are chosen to equate the model-dependent mistake probabilities. The curve was computed using Monte Carlo simulation. For the predictabale consumption growth model, the state {zt} is Are the models a and b easy to distinguish? unobservable and initialized in its stochastic steady state. The mistake probabilities and their logarithms For the i.i.d. model, the prior mean for mc is 0.0056 and the are given in Figures 1 and 2. These figures prior standard deviation is 0.0014. quantify the notion that the two models are close using an extension to Chernoff’s (1952) calculations. For both models the statistician is While the simplicity of this example is reveal- presumed not to know the population mean, and ing, the absence of temporal dependence and for model a the statistician does not know the nonlinearity is limiting. I will explore a dynamic hidden state. All other parameters are known, specification next. arguably simplifying the task of a decision maker. Data on consumption growth rates are Example 2: Following Hansen and Sargent used when attempting to tell the models apart. (2006), consider two models of consumption: From Figure 1, we see that even with a sample one with a long-run risk component and one size of 100 (say 25 years) there is more than a 20 without. Model a is a special case of the con- percent chance of making a mistake. Increasing sumption dynamics given in (1) and is motivated the sample size to 200 reduces the probability by the analysis in Bansal and Yaron (2004): to about 10 percent. By sample size 500, a deci- sion maker can confidently determine the cor- (3) ct11 2 ct 5 0.0056 1 zt 1 0.0054u1, t11, rect model. Taking logarithms, in Figure 2, the growth rate analyzed by Chernoff (1952) and zt11 5 0.98zt 1 0.00047u2, t11, Newman and Stuck (1979) becomes evident. After an initial period or more rapid learn- and model b has the same form, but with zt 5 ing, the logarithm of the probabilities decays 0 implying that consumption growth rates are approximately linearly. The limiting slope is i.i.d. the Chernoff rate. This is an example in which

 The mean growth rate 0.0056 is the sample mean for u1, t11 is the sample standard deviation. In some of my calcu- postwar consumption growth, and coefficient on 0.0054 on lations, using continuous-time approximations, simplicity  AEA PAPERS AND PROCEEDINGS MAY 2007 model selection is difficult for an econometri- I price risks that are lognormal and constructed cian, and it is arguably problematic to assume as a function of this random vector: that investors inside a rational expectations model solved it ex ante. 1 Arguably, sophisticated investors know more exp Qv · m 1 v9Lu 2 v9SvR and process more information. Perhaps this is 2 sufficient for confidence to emerge. There may be other information or other past signals used for alternative choices of the n-dimensional vec- by economic agents in their decision making. tor v. The quadratic form in v is subtracted so Our simplistic one-signal model may dramati- that this risk has a mean with a logarithm given cally understate prior information. To the con- by v · m. trary, however, the available history may be Let exp(r f ) be the risk-free return. The loga- limited. For instance, endowing investors with rithm of the prices can often be represented as full confidence in model a applied to postwar data could be misguided, given that the previous log P1v2 5 v · m 2 r f 2 v9L p era was characterized by higher consumption volatility, two world wars, and a depression. for some n-dimensional vector p, where the vec- tor p contains what are typically called the risk III. Risk Prices and Statistical Ambiguity prices. Suppose that the matrix L is restricted so that In this section, I will show that there is an whenever v is a coordinate vector, a vector with intriguing link between the statistical detec- zeros, except for one entry which instead con- tion problem we have just described and what tains a one, the risk has a unit price P(v) or a is known as a risk price vector in the finance lit- zero logarithm of a price. Such an asset payoff is erature. First, I elaborate on the notion of a risk a gross return. Moreover, the payoff associated price vector by borrowing some familiar results, with any choice of v with coordinates that sum and then I develop a link between the Chernoff to one, i.e., v · 1n 5 1, is also a gross return, and rate from statistics and the maximal Sharpe hence has a price with logarithms that is zero. ratio. With this link, I quantify sensitivity of the Thus, in logarithms, the excess return over the measured trade-off between risk and return to risk-free return is small statistical changes in the inputs. v · m 2 r f 5 v9L p A. A Digression on Risk Prices for any v such that v · 1n 5 1. The vector p Risk prices are the compensation for a given prices the exposure to shock u and is the risk risk exposure. They are expressed conveniently price vector. It gives the compensation for risk as the required mean rewards for confronting exposure on the part of investors in terms of the risk. Such prices are the core ingredients logarithms of means. in the construction of mean-standard deviation Such formulas generalize to continuous time frontiers and are valuable for summarizing asset with Brownian motion risk. The risk pricing implications. prices given in Section I have this form, where u Consider an n-dimensional random vector is a shock vector at a future date. While the risk of the form: m 1 Lu, where u is a normally prices in that example are constant over time, distributed random vector with mean zero and in Section VII, I will give examples where they covariance matrix I. The matrix L determines vary over time. the risk exposure to be priced. This random vec- tor has mean m and covariance matrix S 5 LL9. is achieved by assuming a common value for this coeffi- cient for models with and without consumption predictabil- parameters and the autoregressive parameter for {zt}. The ity. The parameter value 0.0047 is the mode of a very flat data and the likelihood function construction are the same likelihood function constructed by fixing the two volatility as in Hansen and Sargent (2006). VOL. 97 NO. 2 Richard T. Ely Lecture 

B. Sharpe Ratios to investors. Both perspectives are of interest. I now suggest an approach and answer to the ques- The familiar Sharpe ratio (William F. Sharpe tion: Can a small amount of statistical ambigu- 1964) is the ratio of an excess return to its vola- ity explain part of the asset pricing anomalies? tility. I consider the logarithm counterpart and Part of what might be attributed to a large risk maximize by choice of v: price p is perhaps small statistical change in the underlying probability model. fˆ Suppose statistical ambiguity leads us to con- v # m 2 r vrLp max 5 max sider an alternative mean m*. The change m* 2 v · 1n51 "vrSv v · 1n51 "vrSv m alters the mean-standard deviation trade-off. Substitute this change into the maximal Sharpe 5 Z pZ ratio:

f 21 f 1/2 f 21 f 1/2 (m* 2 m 1 m 2 1 r )9S (m* 2 m 1 m 2 1 r ) . 5 C(m 2 r )9S (m 2 r )D . C n n D Using the Triangle Inequality, The solution measures how steep the risk-return trade-off is, but it also reveals how large the C(m* 2 m)9S21(m* 2 m)D 1/2 price vector p should be. A steeper slope of f 21 f 1/2 the mean-standard deviation frontier for asset 2 C(m 2 1nr )9S (m 2 1nr )D returns imposes a sharper lower bound on Z pZ. f 21 f 1/2 Both risk prices and maximal Sharpe ratios # C(m* 2 1nr )9S (m* 2 1nr )D . are of interest as diagnostics for asset pricing models. Risk prices give a direct implication This inequality shows that if when they can be measured accurately, but a weaker challenge is to compare Z pZ from a model (4) 1m* 2 m29S211m* 2 m2 to the empirical solution to (4) for a limited number of assets used in an empirical analysis. is sizable and offsets the initial Sharpe ratio, Omitting assets will still give a lower bound on then there is a sizable movement in the Sharpe Z pZ. Moreover, there are direct extensions that ratio. do not require the existence of a risk-free rate More can be said if I give myself the flexibility and are not premised on log-normality (e.g., see to choose the direction of the change. Suppose Shiller 1982, and Hansen and Ravi Jagannathan that I maximize the new Sharpe ratio by choice 1991). Omitting conditioning information has a of m*, subject to a constraint on (4). With this well-known distortion characterized by Hansen optimization, the magnitude of the constraint and Richard (1987). gives the movement in the Sharpe ratio. Chernoff’s formula tells us when (4) can be C. Statistical Ambiguity economically meaningful but statistically small. Squaring (4) and dividing by eight gives the Even if all pertinent risks can be measured by Chernoff rate. This gives a formal link between an econometrician, the mean m is not revealed the statistical discrimination of alternative mod- perfectly to an econometrician or perhaps even els and what are referred to as risk prices. The link between the Chernoff rate and the maximal Sharpe ratio gives an easily quantifiable role  Much has been made of the in for statistical ambiguity either on the part of an macroeconomics including, in particular, Rajnish Mehra and Prescott (1985). For our purposes it is better to explore econometrician or on the part of investors in the a more flexible characterization of return heterogeneity interpretation of the risk-return trade-off. as described here. Particular assets with “special” returns Could the maximal Sharpe ratio be equiva- can be easily omitted from an empirical analysis. While lent to placing alternative models on the table Treasury bills may contain an additional liquidity premia, because of their role as close cash substitutes, an econome- that are hard to discriminate statistically? trician can compute the maximal Sharpe ratio from other Maybe it is too much to ask to have models of equity returns and alternative risk-free benchmarks. risk premia that assume investor knowledge 10 AEA PAPERS AND PROCEEDINGS MAY 2007 of parameters bear the full brunt of explain- IV. Statistical Challenges ing large Sharpe ratios. Statistical uncertainty might well account for a substantial portion of In this section, I revisit model a (see equation this ratio. (3)) of Example 2 from two perspectives. I con- Consider a Chernoff rate of 1 percent per sider results, first, from the vantage point of an annum or 0.25 percent per quarter. Multiply econometrician and, second, from that of inves- by eight and take the square root. This gives tors in an equilibrium valuation model. an increase of about 0.14 in the maximum Sharpe ratio. Alternatively, a Chernoff rate A. The Struggling Econometrician of 0.5 percent per annum gives an increase of 0.1 in the maximum Sharpe ratio. These are An econometrician uses postwar data to esti- sizeable movements in the quarterly Sharpe mate parameters that are imputed to investors. I ratio, accounting for somewhere between present the statistical evidence available to the one-third and one-half of typical empirical econometrician in estimating the model. I con- measurements. struct posterior distributions from alternative There are two alternative perspectives on this priors and focus on two parameters in particu- link. First is measurement uncertainty faced by lar: the autoregressive parameter for the state an econometrician, even when economic agents variable process {zt} and the mean growth rate know the relevant parameters. For instance, the in consumption. For simplicity and to antici- risk price model may be correct, but the econo- pate some of the calculations that follow, I fixed metrician has imperfect measurements. While the coefficient on u1, t11. I report priors that are the Chernoff calculation is suggestive, there not informative (loose priors) and priors that are well-known ways to account for statistical are informative (tight priors). It turns out that sampling errors for Sharpe ratios in more flex- there is very little sample information about ible ways, including, for example, Michael R. the coefficient on u2, t11. As a consequence, I Gibbons, Stephen A. Ross, and Jay Shanken used an informative prior for this coefficient in (1989). Alternatively, investors may face this generating the “loose prior” results, and I fixed ambiguity, which may alter the predicted value this coefficient at 0.00047 when generating the of p and hence Z pZ coming from the economic “tight prior” results. model. I will have more to say about this in the I depict the priors and posteriors in Figure 3. next section. There is very weak sample information about the The particular formula for the Chernoff rate auto regressive parameter, and priors are poten- was produced under very special assumptions, tially important. There is some evidence favor- much too special for more serious quantitative ing coefficients close to unity. Under our rational work. Means and variances are dependent on expectations solutions we took the parameter to conditioning information. Normal distribu- be 0.98, in large part because of our interest in tions may be a poor approximations. Evan W. a model with a low frequency component. The Anderson, Hansen, and Sargent (2003) build posterior distribution for the mean for consump- on the work of Newman and Stuck (1979) to tion growth is less sensitive to priors. Without develop this link more fully. Under more gen- exploiting cross-equation restrictions, there is eral circumstances, a distinction must be made only very weak statistical evidence about the pro- between local discrimination rates and global cess {zt} which is hidden from the econometrician. discrimination rates. In continuous time mod- Imposing the cross-equation restrictions begs the els with a Brownian motion information struc- question of where investors come up with knowl- ture, the local discrimination rate has the same edge of the parameters that govern this process. representation based on normal distributions with common covariances, but this rate can be state dependent. Thus, the link between Sharpe  Without this focus one might want to examine other ratios and the local Chernoff rate applies to an aspects of consumption dynamics for which a richer model could be employed. Hansen, Heaton, and Li (2006) use cor- important class of asset pricing models. The porate earnings as a predictor variable and document a low limiting decay rate is a global rate that aver- frequency component using a vector autoregression, pro- ages the local rate in a particular sense. vided that a cointegration restriction is imposed. VOL. 97 NO. 2 Richard T. Ely Lecture 11

 

  

 

  

            Y¢

 

  

 

  

            Y¢

Figure 3. Prior and Posterior Probabilities

Notes: This figure displays the prior (the lines) and the posterior histograms for two param- eters of the model with predictable consumption growth. The left column gives the densities for the autoregressive parameter for the hidden state and the right column the mean growth rate of consumption. The results from the first row were generated using a relatively loose prior, including an informative prior on the conditional variance for the hidden state. The prior for the variance is an inverse gamma with shape parameter 10 and scale parameter 27 1.83 3 10 . The implied prior mode for sz is 0.00041. The prior for the AR coefficient is 6 normal conditioned on sz with mean 0 and standard deviation sz 3 1.41 3 10 , truncated to reside between minus one and one. The prior for mc has mean 0.003 and standard deviation 0.27. The results from the second row were generated with a informative prior and fixed the conditional standard deviation for the hidden state at 0.00047. The prior for AR coefficient is normal with mean 0.98 and standard deviation 0.12. The prior for mc is normal with mean 0.0056 and standard deviation 0.00036. The posterior densities were computed using Gibbs sampling with 50,000 draws after ignoring the first 5,000.

B. The Struggling Investors meaningful priors for investors becomes an important specification problem when Bayesian The rational expectations solution of impos- learning is incorporated into a rational expecta- ing parameter values may be too extreme, but, tions asset pricing model and in the extensions for this model, it is also problematic to use that I will consider. Learning will be featured in loose priors. John Geweke (2001) and Martin the next two sections, but before incorporating Weitzman (forthcoming) show dramatic asset this extra dimension, I want to reexamine the return sensitivity to such priors in models with- risk prices derived under rational expectations out consumption predictability. While loose pri- and suggest an alternative interpretation for one ors are useful in presenting statistical evidence, of their components. it is less clear that we should embrace them in In Section I, I gave the risk price vector for models of investor behavior. How to specify an economy with predictable consumption. 12 AEA PAPERS AND PROCEEDINGS MAY 2007

Since investors are endowed with preferences ­variable process {zt} with a mean of 20.002 for which the intertemporal composition of risk and a direct mean decrease in the consumption matters, the presence of consumption predict- growth equation of 20.0001, which is incon- ability alters the prices. Recall the one-period sequential. The contribution to Z pZ measure by risk price vector is the norm of (5) scaled by g 2 1 5 4 is about 0.09. While both distortions lower the average 21 p 5 sc 1 1g 2 12 3sc 1 ba 1I 2 bA2 sz 4. growth rate in consumption, only the second one is substantial. Investors make a conserva- One way to make risk prices large is to endow tive adjustment to the mean of the shock process investors with large values of the risk aversion {u2, t} and hence to the unconditional mean of parameter g. While g is a measure of risk aver- {zt}. This calculation gives a statistical basis for sion in the recursive utility model, Anderson, a sizeable model uncertainty premium as a com- Hansen, and Sargent (2003) give a rather differ- ponent of p. Similar calculations can be made ent interpretation. They imagine that investors easily for other values of g. treat the model as possibly misspecified, and ask While a mean distortion of 20.002 in the con- what forms of model misspecification investors sumption dynamics looks sizable, it is not large fear the most. The answer is a mean shift in the relative to sampling uncertainty. The highly shock vector ut11 that is proportional to the final persistent process {zt} makes inference about term above: consumption growth rates difficult. Moreover, my calculation is sensitive to the inputs that 21 (5) 3sc 1 ba 1I 2 bA2 sz 4. are not measured well by an econometrician. Conditioned on 0.98, the statistical evidence for This is deduced from computing the continua- sz is not very sharp. Reducing sz by one-half tion value for the consumption process. Instead changes the log-likelihood function only by of a measure of risk aversion, g 2 1 is used 0.3. Such a change in sz reduces the Chernoff to quantify an investors’ concern about model rate and the implied mean distortion attributed misspecification. to the {zt} process by factors in excess of three. Is this distortion statistically large? Could Suppose that investors use only data on investors be tolerating statistical departures of aggregate consumption. This presumes a dif- this magnitude because of their concern about ferent model for consumption growth rates, but model misspecification? Our earlier Chernoff one with the same implied probabilities for the calculations are informative. Even with temporal consumption process. This equivalent represen- dependence in the underlying data-generating tation is referred to as the innovations represen- process, the Chernoff discrimination rate is tation in the time series literature and is given by 0s 1 ba1 2 b 2 21s 0 2 2 c I A z 0g 2 10 . ct11 2 ct 5 0.0056 1 z 1 0.0056ut11, 8 t

zt11 5 0.98zt 1 0.00037ut11, Consider, now, the parameter values given in the first model of Example 2. Then where {ut11} is a scalar i.i.d. sequence of stan- dard normally distributed random variables. 21 2 The implied distortions for the consumption 0sc 1 ba1I 2 bA2 sz 0 (6) 0g 2 10 2 growth rate given, say g 5 5, are very close to 8

< 0.0000610g 2 10 2.  For the persistence and volatility parameters assumed in this model, mc is estimated with much less accuracy than that shown in Figure 3. The posteriors reported in this fig- ure assign considerable weight to processes with much less For instance, when b 5 0.998 and g 5 5, the persistence. implied discrimination rate is just about 0.5  As a rough guide, twice the log-likelihood difference is percent per year. This change endows the state a little more than half the mean of a x2(1) random variable. VOL. 97 NO. 2 Richard T. Ely Lecture 13 those I gave previously, based on observing both time in accordance to a Markov process, as in consumption growth and its predictor process. a regime shift model of W.M. Wonham (1964), In this subsection, I used a link between dis- Stanley L. Sclove (1983), and James D. Hamilton torted beliefs and continuation values to reinter- (1989). The signal or outcome s* is observed pret part of the risk price vector p as reflecting in the next time period. If z were observed, we a concern about model misspecification. This is would just use f as the density for the next period a special case of a more general approach called outcome s*. Instead, inferences must be made exponential tilting, an approach that I will about z to deduce the probability distribution for have more to say about in sections VI and VII. s*. For simplicity, we consider the case in which Investors tilt probabilities, in this case means of learning is passive, that is, actions do not alter shocks, in directions that value functions sug- the precision of the signals. gest are most troublesome. This tilting gives a justification for pessimism in beliefs. Stephen G. A. Compound Lottery Cecchetti, Pok-sang Lam, and Nelson C. Mark (2000) and Andrew B. Abel (2002) have shown To apply recent advances in decision theory, how endowing investors with pessimistic beliefs it is advantageous to view the HMM as speci- can help to solve asset pricing puzzles.10 fying a compound lottery repeated over time. While the tilted probabilities in this section Suppose, for the moment, that is observed. are represented as time-invariant mean shifts, Then for each , f 1 · Z 2 is a lottery over the out- by considering learning, I will obtain a source come s*. When is not observed, randomness of time variation for the uncertainty premia. of z makes the probability specification a com- pound lottery. Given a distribution p, we may V. Learning reduce this compound lottery by integrating out over the state space Z: Up until now, we have explored econometric concerns and statistical ambiguity without any ¯ explicit reference to learning. Our next task is to f j 5 j z p z . 1 2 3f 1 0 2 d 1 2 explore the real time implications of learning on what financial econometricians refer to as risk prices. To explore learning in a tractable way, This reduction gives a density for s* that may be consider what is known in many disciplines as a used directly in decision making without knowl- hidden Markov model (HMM). In what follows, edge of . In the applications that interest us, p we let j be a realized value of the signal, while is a distribution conditioned on a history H of s* denotes the signal, which is a random vec- signals.11 tor. We make the analogous distinction between realized values of the state z versus the random B. Recursive Implementation state vector . Suppose that the probability den- sity for a signal or observed outcome s*, given a In an environment with repeated signals, the Markov state , is denoted by f 1 # 0z2. This den- time t distribution, pt, inherits dependence on sity is defined relative to an underlying measure calendar time through the history of signals. dh1j2 over the space of potential signals S. A Bayes’s rule tells us how to update this equa- realized state is presumed to reside in a space Z tion in response to a new signal. Repeated appli- of potential states. cations gives a recursive implementation of In a HMM, the state is disguised from the Bayes’s rule. decision maker. The vector could be (a) a discrete Consider some special cases. indicator of alternative models, (b) an unknown parameter, or (c) a hidden state that evolves over Case 1: Time-Invariant Markov State.—Sup- pose that z is time invariant, as in the case of

10 Abel (2002) also suggests that sampling uncertainty and a concern for robustness might be important compo- nents in justifying pessimism and doubt on the part of pri- 11 Formally, H is a sigma algebra of conditioning events vate agents. generated by current and past signals. 14 AEA PAPERS AND PROCEEDINGS MAY 2007 an unknown parameter or an indicator of a since ef(j Z z) dh(j) 5 1. This implies the familiar model. Let p denote a probability distribution martingale property associated with parameter conditioned on a history H, and let p* denote learning. The best forecast of ef(z)pt11 (dz), giv- the updated probability measure given that s* is en current period information, is ef(z)pt (dz). observed. Bayes’s rule gives: Thus, given the sequence of probability distri- butions {pt(dz) : t 5 0, 1, … .}, the sequence ran- dom variables {ef(z)pt(dz) : t 5 0, 1, … .} is a f 1s*0z2 dp1z2 p*(dz) 5 . martingale. In fact, it is a bounded martingale, ef 1s*0z2 dp1z2 and it necessarily converges. By making an invariant unobservable, we The signal s* enters directly into this evolution have introduced a strong form of stochastic equation. Applying this formula repeatedly for dependence as reflected by the martingale prop- a sequence of signals generates a sequence of erty. Note, however, that the stochastic structure probability distributions {pt} for that reflect will become degenerate as the martingale con- the accumulation of information contained in verges. When learning problems are difficult, current and past signals. the convergence will be slow, as we have seen in Since is time invariant, the constructed state our discussion of Chernoff (1952). probability distribution {pt} is a martingale. Since pt is a probability distribution, this requires an Case 2: Time-Varying Markov State.—Con- explanation. If the set of potential states Z con- sider the case in which is not invariant and its sists of only a finite number of entries, then each evolution is modeled as a Markov process. The of the probabilities is a martingale. More gener- dynamics for this hidden Markov state directly ally, let f be any bounded function of the hidden influence the learning dynamics in ways I will state .12 An example of such a function is the illustrate. Let T(z* Z z) be the transition density so-called indicator function that is one on set and of relative to a measure l(dz) over the hidden zero on its complement. The integral ef(z)dp(z) states. The measure l is chosen for convenience gives the conditional expectation of f( ) when depending upon the details of the application. dp(z) is the conditional distribution for given Later, I will feature examples in which state the current and past signal history H. space Z contains a finite set of values, and the In contrast to p, the distribution p* incorpo- measure l just assigns one to each element of rates information available in the signal s*. Then this set. Other measures are used when is continuous. Our previous calculations extend, except that (7) E f(z) dp*(z) the updating equation for the * posterior dis- s3 ZHt tribution p* must accommodate this evolution. From Bayes’s rule: ef 1s*0z2 f 1z2 dp1z2 5 3s t ef 1s*0z2 dp1z2 (8) p*(dz*) 3 f(s* z)p(dz) dh(s*) s3 Z t 3eT(z* Z z) f (s* Z z) dp(z)4dl(z*) 5 eeT(z˜ Z z) f (s* Z z) dp(z) dl(z˜) 5 f(s* z)f(z) dp(z)dh(s*) 33 Z 5 T¯(s*, p). 5 f(z) dp(z) 3 The distribution p* evolves from p as a function of the signal s* in accordance to the function T¯. Given the stochastic evolution of the hidden state, we lose the martingale property. The case 12 Formally, we also restrict f to be Borel measurable. in which is invariant, considered previously, VOL. 97 NO. 2 Richard T. Ely Lecture 15 is a special case with a degenerate specification the endogenous prices. More generally, I could of the transition law: * 5 . When the hidden introduce private signals and endogenously state has a nondegenerate transition law, we lose determined price signals. This approach to learn- the martingale property. If the transition law T ing is an enrichment of rational expectations to is stochastically stable (that is, there is unique include subjective uncertainty while preserv- stationary distribution associated with T ), then ing the essential equilibrium components. The this asymptotic stability carries over to the evo- resulting equilibrium model is what Margaret lution of the probability distributions: {pt} cap- Bray and David M. Kreps (1987) call learning tured by T¯. within a rational expectations model. After all, Lucas’s (1972b) use of rational expectations in a C. A New Markov Process private information economy has agents learn- ing from price signals, so aspects of learning It follows from what we have just shown that have been central features in rational expecta- we can represent this form of learning as a new tions equilibria from the outset. Markov process. For this new process, the hid- There are other ways to introduce learning den state is replaced by a distribution over the that push us outside the realm of rational expec- hidden state. The density for the signal is tations in a more substantial way. For instance, we might pose the learning challenge directly on the price dynamics or the endogenous state f¯ (j p) 5 f(j z) dp(z), variables. This has led to what Bray and Kreps Z 3 Z (1987) call a model of learning about a ratio- nal expectations equilibrium. Adaptive control and p evolves according to T¯ given in (8). Thus methods or Bayesian methods are applied that we may conceive of learning as justifying a fail to impose some of the internal consistency Markov process with a “state variable” p. conditions of a rational expectations equilib- I derived this learning solution using Bayes’s rium. See Bray (1982), Xiaohong Chen and rule, but, otherwise, we did not appeal to a spe- Halbert White (1998), Albert Marcet and Sargent cific decision problem. The hidden state may be (1989), Sargent (1999), and George W. Evans hidden to the econometrician or it may be sub- and Seppo Honkapohja (2001) for examples jective uncertainty in the minds of investors. If of what has become an important literature in the former, its estimation is a problem only for macroeconomics. Agents are assumed to apply an econometrician. If the latter, both the econo- Bayesian learning methods to misspecified but metrician and the investor being modeled may typically simpler models, or they apply adap- aim to integrate it out or reduce the compound tive methods that aim to provide a more flexible lottery. For instance, I could use this learning approximation. The outcomes of these misspeci- solution to alter the model of exogenous shock ficied Bayesian or adaptive learning algorithms processes such as technology shocks. I simply are fully embraced as beliefs by the economic replace one state variable, , by another, the dis- agents when making forward-looking decisions. tribution p. The recursive solution becomes an There is typically no acknowledgment of the input into our rational expectations model with potential misspecification. The dynamic sys- the additional econometric challenge of specify- tems may have limit points, but they may imply ing an initial condition for p, a priori.13 Since p weaker consistency requirements than a rational is a distribution, for many state spaces it can be expectations equilibrium.14 Since this approach an infinite dimensional state variable, but that is to learning does not presume that decision mak- a computational issue, not a conceptual one. ers in the model fully perceives the uncertainty Given this change, I may define a rational they confront, the resulting equilibria ignore expectations equilibrium to determine the endog- a potentially important source of uncertainty enous state variables such as capital stocks and

14 The weaker equilibrium concept is known as a 13 Moreover, since may be disguised, identification of self-confirming equilibrium. See Sargent (1999) for a its dynamics as captured by T may be more challenging. discussion. 16 AEA PAPERS AND PROCEEDINGS MAY 2007 premia that might show up in prices that clear The first equation gives the continuous-time security markets. The economic agents in such counterpart to f¯, the density for the signal. The models experience no specification doubts. second equation gives the counterpart to T¯, the evolution equation for the probabilities. The # D. Dynamic Learning in a Regime Shift Model Brownian motion increment dBt can be inverted from the signal evolution equation. To illustrate the dynamics of learning, we use There are notable features of this solution. a solution first characterized by Wonham (1964) First, the matrix A used to model the hidden state to a filtering problem that economists sometimes dynamics plays a central role in the dynamics refer to as a regime shift model. The model and for ¯. It enters directly into the formula for the solution are given most conveniently in continu- local conditional mean, A9¯. Second, the new ous time. Consider a signal, information contained in the signal evolution is captured by the increment to the Brownian # # dst 5 k ztdt 1 sdBt, motion dB, which we express as: # where {Bt} is standard Brownian motion, and (9) dBt 5 k · 1 t 2 ¯t2dt 1 dBt. { t} is a hidden state Markov chain with inten- sity matrix A. The intensity matrix conveniently This represents the new information encoded in summarizes the matrix of transition probabili- the signal history as a compound lottery. Both ties for the hidden state via the formula: exp(tA) Brownian motions B and B# are standardized for any positive number t. The realized value (they have unit variance over a time interval of of t is a coordinate vector. Thus, k · t selects length one). The reduced information does not among the entries in the vector k in determining alter the local accuracy of our forecast of the sig- the local growth rate for the signal process. This nal. While this is a special property of continu- specification is a continuous-time counterpart ous-time models with signal noise generated by to the regime switching model of Sclove (1983) a Brownian motion, it does give us an informa- and Hamilton (1989). It has been used in asset tive limiting case. Finally, the vector D contains pricing models by Alexander David (1997) and the local (in time) regression coefficients of the Pietro Veronesi (2000). Given this model, we hidden state onto the new information in the sig- can think of dst conditioned on the state t as a nal. These coefficients depend on the state prob- compound lottery. ability vector ¯t. When one of the entries of ¯t is The Wonham filter gives the solution to close to unity, the vector D is close to zero. reducing the compound lotteries while updat- The following example is of some pedagogi- ing probabilities based on past data. Since t is cal interest because it illustrates how varying a coordinate vector, its conditional expectation, parameters of this model alter the temporal given the signal history, is the vector of hidden dependence of the probabilities and the sensitiv- state conditional probabilities. As for notation, ity of these probabilities to new information. we let ¯t 5 E( t Z Ht ), which is the vector of hid- den state probabilities. Thus, the conditional Example 3: Consider the following two-state mean ¯t contains the vector of state probabilities example. The two-by-two intensity matrix is used to depict pt. The recursive filtering solution parameterized by is a stochastic differential equation represented in terms of an alternative standard Brownian motion { # }: 2a1 a1 Bt A 5 s t , a2 2a2 # dst 5 k · ¯t dt 1 sdBt ,

where a1 $ 0 and a2 $ 0. Since probabilities add up to one, it suffices to consider only one of d ¯t 5 A9¯t dt 1 D(¯)(dst 2 k · ¯t dt) , the probabilities, say the probability of being in state one. Substituting from our parameteriza- D(¯) 5 1 diag(¯)(k 2 1 k · ¯). tion of A, s2 t n t VOL. 97 NO. 2 Richard T. Ely Lecture 17

rate of convergence is precisely what Chernoff’s a2 # 5 2 1 # 2 analysis allows us to investigate. The inferential d 1 2 a b z1,t a1 a2 z1,t 1 dt a1 a2 problem that is presumed in my application of k 2 k the Wonham filter includes a model selection 1 # 1 2 # 2 1 2 # . problem where the invariant state is a model z1, t 1 z1, t a s bdBt indicator indexed by an invariant state. I now use the stochastic structure of the In this example, the unconditional mean of the Wonham (1964) filtering model to explore probability of being in state one is a2/(a1 1 a2). dynamics of model selection. The local volatility of the probability scales with the difference between means relative to the sig- Example 4: Consider an example with three nal volatility. When the difference in the k’s is states. Two states give rise to movements in large relative to s, the probabilities are more the growth rate for consumption. Movements responsive to the new information contained in between states are random and shift the growth the signals. This responsiveness becomes arbi- rate in the signal as in Example 3. The third trarily small if the probability is close to zero state is invariant. It cannot be reached from the or one. If a1 5 a2 5 0, then the probability is first two states. Formally, theA matrix is a positive martingale. When a1 and a2 are both positive, the probability process is asymptoti- cally stationary. Larger values imply more mean 2a1 a1 0 15 reversion in the probabilities. A 5 £ a 2a 0 § , While I feature the Wonham filter in this 2 2 essay, there are other well-known filtering meth- 0 0 0 ods, including the Kalman filter, the particle fil- ter, and the Zakai equation. There are alternative where a1 . 0 and a2 . 0. There is no possible ways of characterizing the solutions to f¯ and T¯. movement from states one and two to state three or from state three to states one and two. While E. Real Time Model Detection the third state is invariant, the decision maker does not know if this third state or regime is the I began this essay by considering a model relevant one. Thus, he faces a model selection detection problem posed by Chernoff (1952). problem. The stochastic specification of the Wonham fil- ter gave me a way to move across regimes in real Given the existence of a time-invariant hid- time, but it also includes time-invariant indica- den state, the dynamic extension of the Chernoff tors of models as envisioned by Chernoff. Such (1952) analysis determines the asymptotic dis- indicators are natural limits of low-frequency crimination rate between models (states one and movements in regimes. Learning about low two versus state three). This leads to the study frequencies will be an important component to of the asymptotic behavior of the filtering solu- some of our calculations, and, therefore, it war- tion when the signal is restricted to spend all its rants special consideration. time in states one and two, or when the signal is When time-invariant indicators are included restricted to spend all its time in state three. In as possibilities, it is no longer fruitful to appeal the former case, the process {# } will converge z3, t to a stochastic steady state. While such steady to zero eventually at an exponential rate, while states exist, they are degenerate and the interest- in the latter case {# 1 # } will converge to z1, t z2, t ing question becomes one of convergence. The zero eventually at this same rate. The local counterpart to Chernoff’s discrimi- nation rate is 15 We do not mean to imply that the drift determines the pull of the process toward the center of its distribu- # # 2 tion. Given that volatility is state dependent, it also plays z1, t z2, t ck a b 1 k a b 2 k d a role in pulling the distribution away from the boundar- 1 # 1 # 2 # 1 # 3 z z z z ies. When volatility is relatively low, the pull by the drift 1,t 2,t 1,t 2,t . is more potent. 8s2 18 AEA PAPERS AND PROCEEDINGS MAY 2007

This rate depends on the local mean difference A. Irreducible Lotteries between the two models where the first model is the original two-state model of Example 3, Segal (1990) studies two-stage lotteries and and the second model has local mean of k3dt axioms that do not imply reduction. Instead, that is time invariant. Small mean differences the conditional composition of risk matters. We across the models relative to the volatility make explore two distinct motivations for why condi- model discrimination challenging. Since this tioning might matter. First, we distinguish the local rate is time varying, as I argued before, the riskiness of s* conditioned on from riskiness asymptotic discrimination rate is an average of over the hidden state or time-invariant parame- this local rate with respect to an appropriately ter z. Klibanoff, Marinacci, and Mukerji (2005) defined mixture model (see Newman and Stuck develop this idea further to distinguish risk or 1979). objective uncertainty, captured by the signal In addition to considering learning with time- density f 1 # 0z2, from subjective uncertainty, invariant hidden states, I will explore the impli- captured by probability distribution p over hid- cations of recent decision theory that will allow den states. They give an axiomatic justification us to feature learning and concerns about model for a convenient representation of preferences. specification but preserve many other useful Let the adjustment for risk conditioned on features of a rational expectations equilibrium. be represented by an increasing concave func- tion h:

VI. Beliefs and Preferences (10) V (a ) 5 h21 j f j h j , Z a3h3a1 2 4 1 0z2 d 1 2b Expected utility theory embraces the axiom that compound lotteries should be reduced. If, as suggested previously, we view f(j* Z z) and where a is some action or decision expressed dp(z) as a compound or two-step lottery, then as a function of the signal. The h21 transforma- the ranking induced by expected utility prefer- tion is convenient because if the random s* can ences depends on the reduced lottery: be perfectly predicted given , the right-hand side of (10) gives the state contingent action. Construct a second-stage ranking based on the f¯ (j) 5 j z p z . utility function 3f1 0 2 d 1 2

The integration defines a lottery that does not (11) g V(a z) dp(z), 3C 3 Z 4D condition on 5 z, and compounding is just a way to depict or even restrict lotteries of interest. Similarly, decisions or actions that depend on s* using the strictly increasing concave utility can be represented as a compound lottery that function g. As a special case, if g 5 h, can be reduced using the density f¯. For the exam- ple economies we explore, the use of expected utility theory implies that rational learning has only modest implications for predicted risk pre- h[a(j)] f (j z) dh(j)dp(z) 33 Z mia. This leads me to employ generalizations of this theory that avoid the presumption that compound lotteries should simply be reduced. 5 h[a(j)] f¯ (j) dh(j). 3 Kreps and Porteus (1978); Uzi Segal (1990); and Peter Klibanoff, Massimo Marinacci, and Sujoy Mukerji (2005) provide alternative decision Preferences that do not reduce compound lot- theories that resist the reduction of compound teries permit h to differ from g. The behav- lotteries. Associated with some of these formu- ioral responses to the two different forms of lations are alternative beliefs that are tilted in risk or uncertainty are allowed to be differ- ways that I characterize. ent. Klibanoff, Marinacci, and Mukerji (2005) VOL. 97 NO. 2 Richard T. Ely Lecture 19 defend this as allowing for a smooth version of ­different interpretations are implications of a 2 ambiguity aversion when g + h 1 is concave. well-known result from applied probability: Following Kreps and Porteus (1978), we may use the same setup to consider a rather differ- ent question. Consider two lotteries. One is (12) min EmV 1 uE(m log m) a(s*), where z is observed at an intermediate m$0, Em51 date. Then V(a Z z) can be thought of as the con- 1 ditional utility at this intermediate date and the 5 2u log E expa2 Vb, initial period utility is given by (11). How does u this lottery compare to a second lottery with the ­identical reduced distribution, but with all infor- where V is a random variable that represents the mation revealed at the final date? The second future value of a stochastic process of consump- lottery uses the density f¯ for s*. At the interme- tion, and m is a random variable used to distort diate date, no new information is revealed about probabilities. The right-hand side of (12) is a the lottery and the resulting valuation is special case of

V¯(a) 5 h21 j f¯ j h j . 21 , c3h3a1 2 4 1 2 d 1 2d h 1E3h1V2 4 2

This valuation is not conditional on any informa- where h is minus the negative exponential func- tion, so at the outset we simply evaluate g at V¯(a) tion parameterized by 1/u: h(V) 5 2exp[2(1/u)V]. to obtain the initial period utility. Provided that As featured by David H. Jacobson (1973), Peter 2 g + h 1 is convex, the first lottery is preferred to Whittle (1981), and others in the control theory the second. The converse is true if this function literature, the left-hand side of (12) offers a is concave. Knowing z at an intermediate date rather different perspective than the apparent alters preferences, even when the a is allowed risk adjustment made on the right-hand side of to depend only on the signal s*. In contrast to (12). The computation, expected utility preferences, the timing of when uncertainty is resolved matters. Thus, there are two rather different moti- EmV, vations for building preferences that depend on more than reduced lotteries: (a) wanting to incorporate a formal distinction between risk for a positive random m with a mean of one, conditioned on a hidden state versus subjective gives an alternative way to form expectations. uncertainty about that state, and (b) wanting Formally, the random variable m induces a preferences that are sensitive to when informa- different probability distribution and the term tion is revealed even when the (reduced) decision uE(m log m) is a convex penalty in the distortion distribution is unaltered. Epstein and Zin (1989) m. The left-hand side of (12) explores expecta- build on this latter motivation by featuring an tions of V using different probability distribu- implied distinction between risk aversion and tions. By setting the parameter u arbitrarily intertemporal substitution. In the next section, I large, probability distortions are penalized so will implement both these modifications to pref- severely as to approximate the original expecta- erences in dynamic settings. Both can amplify tion EV. Finite values of u permit consideration the impact of learning on risk prices. of alternative probability measures subject to penalty. Thus, formula (12) gives an explicit link B. Exponential Tilting between robustness (left-hand side) and risk sensitivity (right-hand side), where the latter is For some convenient parameterizations, there modeled using an exponential risk adjustment. are substantially different interpretations of this Robustness allows us to endow our decision utility representation that will allow us to ex- maker with an operational form of skepticism plore implications of statistical ambiguity. These about his model. It is implemented by the choice 20 AEA PAPERS AND PROCEEDINGS MAY 2007 of a tilted or distorted probability measure in- negative exponential specification of theg func- duced by the minimizing m. The solution is tion to endow decision makers either with a con- cern about the specification of the probabilities 1 assigned to the hidden states (left-hand side) or exp1 2 uV2 (13) 5 , a smooth ambiguity adjustment as in Klibanoff, m 1 E3exp1 2 uV2 4 Marinacci, and Mukerji (2005) (right-hand side). These ideas are developed more formally provided that the denominator is finite. This in several recent papers. While (12) exploits a solution gives what is known as an exponential particular functional form, Fabio Maccheroni, tilting of the original probability. Smaller values Marinacci, and Aldo Rustichini (2006a) pro- of V receive relatively more weight than larger vide an axiomatic justification for a more gen- values in the altered probability distribution. eral version of this penalization formulation The altered distribution is tilted toward states given by the left-hand side of (12), where the with lower continuation values. convex function m log m is replaced by a more The implementation via a tilted probabil- general convex function. Hansen, Sargent, ity turns out to be of considerable value. The Turmuhambetova and Williams (2006) show minimizing solution is useful for represent- how the intertemporal counterpart to (12) is ing uncertainty premia and providing a differ- related to the max-min expected utility of Itzhak ent perspective on the source of those premia. Gilboa and David Schmeidler (1989) by for- Previously, I described the potential role for mally interpreting the penalization parameter u statistical latitude among alternative probability as a Lagrange multiplier on a constraint over a models given data histories. I now have a way family of probability distributions. Maccheroni, to construct these alternative models and to ask Marinacci, and Rustichini (2006b) explore how large the resulting statistical discrepancy more general dynamic formulations of prefer- between the minimized solution and the origi- ences based on penalization. Finally, Hansen nal benchmark probability model is.16 and Sargent (forthcoming) use two versions of While this representation of preferences negative-exponential formulation to address using exponential tilting relies on a particu- simultaneously two forms of misspecification lar parametric structure, it is mathematically described previously: (a) misspecification in the convenient. In what follows, I will apply (12) underlying Markov law for the hidden states, in multiple ways. In dynamic contexts, it is and (b) misspecification of the probabilities most fruitful to work with continuation values assigned to the hidden Markov states.17 for optimal plans because of the usefulness of Bellman-equation methods. First, I will exploit VII. Learning and Uncertainty Premia (12) as applied to future continuation values by either endowing the decision maker with a Empirical evidence suggests that risk premia concern about the specification of Markov state move in response to aggregate fluctuations (e.g., transition probabilities (left-hand side) or a con- see Campbell and John H. Cochrane 1999, and cern about the intertemporal composition of risk Martin Lettau and Sydney Ludvigson (forth- as in Kreps and Porteus (1978), Epstein and Zin coming). I now explore how learning might (1989), and others (right-hand side). Second, by contribute to an explanation for this phenom- characterizing the dependence of future contin- enon. While I will present some highly stylized uation values computed as a function of a hid- models, the lessons from this analysis are infor- den state , I will use (12) in conjunction with a mative for more ambitious quantitative investi- gations. The Wonham (1964) filter will be a key input into our characterization. 16 Typically, the minimizing solution will differ as alter- native choices are considered. It will often be the case that the minimization can be done after maximization of util- ity without changing the value. Thus, a min-max theorem can be invoked. In such circumstances, we can still infer a 17 Epstein and Schneider (2003, 2006) make similar dis- worst case probability distribution by exchanging the order tinctions while developing other interesting formulations of minimization and maximization. and applications of ambiguity aversion and learning. VOL. 97 NO. 2 Richard T. Ely Lecture 21

My characterizations of prices will focus on Following Veronesi (2000), we use the prob- what is usually termed the “local risk return ability model assumed by Wonham (1964) in trade-off.” In continuous-time environments, which the signal is the growth rate in consump- “local” means instantaneous and the trade-off tion. In this specification, the expected growth answers the question: “How do we compen- rate of consumption has infrequent jumps: sate investors for risk borne in the immediate # future?” I use the term “uncertainty premia” to dct 5 k zt 1 sdBt . capture the additional components to pricing that emerge from using the decision theory of By solving the filtering problem, I compute a Section VI that in some way or another does not second evolution for consumption that endows simply reduce compound lotteries. In dynamic investors with less information. In this second economies valuation of cash flow exposure specification, the expected growth rate of con- uncertainty is pertinent for all horizons, not sumption moves continuously as a function of just the immediate one. The recursive nature of the probabilities, the ¯t’s. To an econometrician asset pricing allows us to, in effect, integrate the looking only at consumption data, these two local consequences into implications for longer specifications are indistinguishable. I will make horizons. There is good reason to suspect that reference to both information structures and learning can have a more potent impact for their implications for pricing. valuation over longer horizons. Constructing a In what follows, I compute alternative value model in which learning matters for short-term functions and probability distortions, begin- risk analysis is a tall order, but such a model will ning with expected utility. My approach in this likely pay off by also implying substantial con- paper will be derivation by assertion, and the sequences for risk-return trade-offs for longer interested reader will have to look elsewhere for horizons. formal derivations. For the equilibrium calculation, I imitate a device used in the rational expectations literature A. Continuation Values for Expected Utility (see Lucas and Prescott 1971) by introducing a fictitious social planner. Given a consumption Given the assumption of a unitary elasticity endowment, the role of this planner is to compute of substitution, we look for continuation values value functions and the exponentially slanted of the form, Vt 1 ct, where Vt depends either probabilities associated with these functions. on the state vector t or the hidden state prob- The sole purpose of this planning problem is to abilities ¯t. characterize these implied probability distor- Suppose for the moment that t is observed tions. If production were incorporated, then the as of date t. For a reference point consider dis- planner’s problem would be more ambitious, but counted expected utility in continuous time. In it would still include the computation of these this case, we may represent the continuation distortions. Behind this solution to the planner’s value as Vt 1 ct 5 v · t 1 ct, where v is an n- problem is a counterpart to rational expecta- dimensional vector of numbers. The vector v tions equilibrium with decentralized prices. The satisfies the linear equation planner’s continuation values are the utility val- ues assigned to consumption processes looking (14) 0 5 2dv 1 Av 1 k, forward, and they will be computed as functions of the Markov state using continuous-time ver- and hence v 5 (dI 2 A)21k. The continuation sions of Bellman’s equation. value when t is not observed is Vt 5 v · ¯t 1 ct, Conveniently, the probability distortion asso- which may be computed by applying the Law of ciated with exponential tilting, formula (13), Iterated Expectations or equivalently by reduc- is computed using continuation values. This ing the associated compound lottery. approach can be viewed as a device for comput- When the jumps are observed, there are ing risk premia, as a way to generate alternative two risk components to price: the Brownian beliefs, or as a reflection of statistical ambiguity ­increment dBt and the jump process { t}. Since on the part of investors. It is the latter interpreta- the consumption does not jump (only its condi- tion that I feature here. tional mean jumps), the local risk price for the 22 AEA PAPERS AND PROCEEDINGS MAY 2007 jump component is zero. Since the elasticity of equation collapses to the equation (14) used for substitution is unity, the Brownian motion risk evaluating discounted expected utility. price is s. In the reduced information economy This Bellman equation is the counterpart to in which the jump component is not observed, the right-hand side of (12). Associated with the only the increment dB# is priced. Since the coef- left-hand side is probability distortions induced t# ficients on dBt and dBt are the same for both by exponential tilting. While we will not formally information structures, the local risk prices derive this distortion, it is easy to characterize. remain the same for this economy. In this sense, For this continuous time limit, the exponential the introduction of learning within a rational tilting has a simple impact on the underlying expectations equilibrium is inconsequential for Brownian motion. A constant drift is added of 18 the local risk price vector. the form 2(s/uf ). The negative of this drift is the In defense of rational learning, the prices uncertainty premia added to the risk premium s of cash flows over finite time intervals will be derived for the expected utility model. sensitive to the information structure, and this Under this distorted probability, consumption sensitivity can be substantial depending on evolves according to: the model specification. In order to generate a model in which learning alters local prices, how- 2 s ˜ ever, I explore other preferences as described dct 5 k · ztdt 2 u dt 1 sdBt previously. f

B. Continuation Values and for some standard Brownian increment dB˜t . 2 Exponential Tilting Thus, we have subtracted s /uf from all of the hypothetical growth states. This constant adjust- Consider, next, a modification as in Kreps and ment is a feature of other models as well, includ- Porteus (1978) under the assumption that z can ing the discrete time models of Tallarini (2000) be observed. Let h be the negative exponential and Hansen, Heaton and Li (2006b). function with parameter value uf. This function The transition probabilities for the Markov is used to adjust future continuation values. In process are also distorted by the exponential this case we modify Bellman’s equation: tilting. The transitions to states with the smaller continuation values will be made more prob- v able. The jump risk exposure now has nonzero (15) 0 5 2dv 1 k 2 u diag exp uncertainty premia in contrast to the zero risk f c au b d f premium from the expected utility economy.19 v 1 This gives a continuous time counterpart to 3 A exp 2 2 s21 , the discussion in Section IV. By interpreting the a c a u bd b 2u n f f uncertainty premia as reflecting statistical ambi- guity on the part of investors, Anderson, Hansen, where 1n is an n-dimensional vector of ones. The and Sargent (2003) and Pascal Maenhout (2006) new terms included in the Bellman equation argue that the statistical discrimination analy- adjust the continuation values for risk, both sis of Chernoff (1952) suggests how large this jump risk and the Brownian motion risk (see uncertainty component could plausibly be. It Anderson, Hansen, and Sargent 2003 for a deri- suggests how much statistical latitude there vation). As uf gets arbitrarily large, this Bellman might be in distorting the consumption growth rates from the vantage point of skeptical inves- tors, investors whose doubts about their model ­specification cannot be dismissed easily with 18 Arguably, this conclusion takes time separability in statistical evidence. preferences too literally in a continuous-time model. Ayman Hindy and Chi-fu Huang (1992) and John Heaton (1995) argue that locally durability should be an important feature of preferences specified at high frequencies. Nevertheless, I find this local analysis to be revealing because it shows how 19 See Jun Liu, Jun Pan, and Tan Wang (2005) for a to amplify the role of learning. related example featuring jump risk. VOL. 97 NO. 2 Richard T. Ely Lecture 23

Alternatively, a rational expectations econo- Table 1—Risk and Uncertainty Premia metrician calibrating the model could have made a mistake in building a rational expec- Exponential tilting Exponential tilting Exp. Utility consumption state estimation tations model by not endowing agents with IES 5 1 dynamics dynamics lower potential growth rates for consumption. s s/uf (1/uf)D(z¯) · [0V(z¯)/0z¯] A Chernoff-type calculation based on real data Time invariant Time invariant Time varying controls the extent to which growth rates could be diminished, but they are set at this new level Notes: The value function has functional form: V(z¯) 1 c, s with full investor confidence. is the response of consumption to new information, and D(z¯) is the vector of responses of the probabilities to new This model of investor preferences increases information. the predicted uncertainty premia associated with the Brownian motion increment dB˜t , but it does not cause them to be time varying. I now examine another modification to the model distortion again adds a drift to the Brownian which delivers time-varying premia. motion, but now the drift depends on the state probabilities. Three contributions to the uncer- C. Exponential Tilting and Less Information tainty premia are given in Table 1. While I will not derive this formula, it follows from the Suppose now that the state variable z is not analysis in Hansen, Sargent, Turmuhambetova, observed. Instead it is disguised requiring that and Williams (2006). The first term is the risk statistical inferences be made using the Wonham adjustment from expected utility theory when (1964) filter. I may not just average the solution the IES is unity. The second term is familiar to equation (15) over the hidden states to obtain from our analysis of the model in which the the solution to this problem. Instead, as Kreps jump component is observed.20 The impact of and Porteus (1978) and Epstein and Zin (1989) learning is reflected in the third term, which show, the intertemporal composition of risk depends explicitly on z¯. The second two terms matters. Thus, I must solve a new Bellman equa- depend on the derivatives of the value function tion that includes an alternative risk adjustment and the local volatility. They distort the evolu- to the continuation value tion of consumption and the state probabilities from the Wonham filter via exponential tilting. (16) Unfortunately, we lose some pedagogical sim- plicity because the value function, and hence its V derivative, must be computed numerically. 0 5 2dV 1 k · ¯ 1 ¯9A90 ¯ To illustrate this solution, consider an exam- 0 ple with two states as in Marco Cagetti et al. (2002).21 From Example 3, when there are two 2V hidden states, the vector D has 1 1D9 0 D 2 0¯0¯9 k 2 k # 2 # 1 2 z 11 z 2 a b 2 1,t 1,t s 1 0V 0V 2 £ p · D p 1 2s · D 1 s2§ , 2uf 0¯ 0¯ as its first entry and the negative of this as its second entry. The scale factor # 1 2 # 2 is z1, t 1 z1, t where the value function is V(¯) 1 c. The last term captures the risk adjustment to continua- 20 An astute reader will notice that I have also distorted tion values necessary for the Kreps and Porteus the dynamics for z¯. We retain use of this as a state variable, (1978) recursion (see Darrell Duffie and Epstein but we lose its interpretation as the solution to a simple fil- 1992). tering problem. 21 Cagetti et al. (2002) consider formally a production Again, I use a link to robustness to construct economy and they do not impose a unitary elasticity of an implied change in probability measure. The substitution. 24 AEA PAPERS AND PROCEEDINGS MAY 2007

 increase the uncertainty about an underlying  growth rate regime.22  While this example produces interesting time-  series variation in local uncertainty premia, it does  so by distorting the dynamic evolution equation for the state vector. This includes distorting the price  component originally constructed as Wonham’s  (1964) solution to a filtering problem. Investors  treat state estimates from the Wonham (1964)  filter like any other observable state variable,  and they do not distort the current period state  probabilities. 0 0.2 0.4 0.6 0.8 1 probability D. Estimation Ambiguity Figure 4. Uncertainty Price Function I now explore alternative approaches that Notes: The uncertainty price is the sum of the second and directly distort the state probabilities. To feature third components given in Table 1. It is computed for a the role of ambiguity in the assignment of state two-state Markov chain. To produce this curve, I assumed an intensity matrix A from Cagetti et al. (2002) with off- probabilities, I follow Klibanoff, Marinacci, and diagonal elements equal to 0.0736 and 0.352. The growth Mukerji (2005) and Hansen and Sargent (forth- rates are k1 5 0.0071 and k2 5 0.0013. The value function coming) by introducing a separate adjustment was computed taking a quadratic approximation around for ambiguity over the probabilities assigned to the implied unconditional mean of the probability of being states. in the first state with parameter values uf 5 0.1 and d 5 2log 0.998. Using a continuous-time counterpart to a decision model of Hansen and Sargent (forth- coming) and decomposition (9), we may obtain close to zero when there is a preponderance of a modified version of the Bellman equation (16). evidence for one or the other states. This term To feature the role of hidden states, the fictitious is large when it is hard to tell the two states planner modifies the equation by considering apart, that is when # 5 1 2 # 5 1/2. The the evolution of continuation values prior to the z1, t z1, t actual uncertainty prices depend on value func- information reduction. Even though the value tion derivative as well, but it remains true that function depends only on ¯ and c, its evolu- uncertainty prices become large when there is tion now depends on the realized hidden states. ambiguity about the hidden state probabilities, Hansen and Sargent (forthcoming) introduce a as illustrated in Figure 4. second parameter, say ub, to penalize distortions When Cagetti et al. (2002) fit a technology to the probability vector ¯ used by the planner shock model with two regimes using economet- for computing the averages required for a new ric methods, like Hamilton (1989), they found continuous-time Bellman equation. The result- growth rate regimes that moved over what mac- ing solution remains difficult to compute unless roeconomists typically refer to as the business the number of states is small. cycle. The time series of resulting uncertainty For my numerical examples, I use a second prices are large at dates at which investors do approach suggested by Hansen and Sargent not know which regime they are in: these are (forthcoming). The solution can be easier to dates at which both regime probabilities are one- compute, which we exploit in solving the four half. Repeated observations of low consumption hundred state Markov chain example which growth strengthen investor beliefs that they are ­follows.23 Consider again the Kreps and Porteus in the low growth regime, thereby resolving some of the uncertainty and reducing the pre- mium. I imagine that by including more states, 22 Such extensions are worthwhile, but the value func- tion for this model must be solved numerically. This limits in particular more low-frequency movements in the scope of such an analysis. growth rates, I can modify this outcome so that 23 While computationally simpler, its game theoretic some repeated observations with low growth underpinnings are more subtle. VOL. 97 NO. 2 Richard T. Ely Lecture 25

Table 2—Risk and Uncertainty Prices 0.14

Exponential tilting Exponential tilting 0.12 Exp. Utility consumption state estimation IES 5 1 dynamics dynamics 0.1 s s/u [(z¯ 2 z˜ ) · k]/s f 0.08 Time invariant Time invariant Time varying price 0.06 Notes: The value function as functional form: V(z) 1 c, s is the response of consumption to new information, z¯ is the vector of probabilities from the Wonham filter, z˜ is the 0.04 exponentially tilted counterpart and k is the vector of alter- native growth rates. 0.02

0 1950 1960 1970 1980 1990 2000 date

(1978) recursion conditioned on the hidden state Figure 5. Time Series of Uncertainty Prices . Recall that the value function is v · 1 c, where v is a vector of real numbers. I use the Notes: The uncertainty prices are the sums of the sec- continuation values in conjunction with (12) for ond and third components given in Table 2. The Markov u 5 u to infer a distortion of the hidden state chain has 400 states, 100 for each of 4 submodels. The four b submodels are (i) AR1 with an AR coefficient of 0.97, a probabilities. Recall that the probability dis- shock standard deviation of 0.00058, and an unconditional tortion results in an exponential tilting of the mean of 0.0056; (ii) AR1 with an AR coefficient of 0.98, ­probability assignment toward states with the a shock standard deviation of 0.00047, and unconditional lowest continuation values. That is, let mean of 0.0056; (iii) AR1 with an AR coefficient of 0.99, a shock coefficient of 0.00024, and an unconditional mean of 0.0056; (iv) i.i.d. model with prior on the mean given by 0.0056 and a prior standard deviation of 0.00036. The 2 * vi vi 5 exp a2 b curve was computed assuming that ub 5 24, and the 2 2 ub curve was computed assuming that ub 5 6. The upper two plots were computed assuming that uf 5 0.05, and the lower two plots were computed assuming that uf 5 0.1. For all of for some positive value of the parameter uf. the plots, d 5 2log 0.998. * Large values of ub make vi ’s close to a constant value of unity. The distorted or exponentially tilted probabilities assigned to hidden state are in computing the exponential tilted state prob- *# abilities. The third term reflects the contribution v z z˜i 5 i i,t . of learning. From a robustness standpoint, the t *# g ivi zi, t parameter uf reflects forward-looking skepti- cism about assumed dynamics and the param- These tilted probabilities induce a distortion eter ub a backward-looking skepticism about the in the expected growth rate for consumption constructed state probabilities. and hence add a component to the uncertainty The time series plots in Figure 5 display the premia. sum of the second and third components of the Three contributions to the uncertainty premia uncertainty premia, which are the components are given in Table 2. The first two are familiar associated with probability slanting. I con- from our previous example economies and the struct a Markov chain to approximate four dif- third is unique to this example. The first is a risk ferent parameter configurations or submodels. premia, and the second term is determined by Formally a submodel is a collection of states the continuation values conditioned on and the for which there is no chance of leaving that parameter uf. Both are constant. The third term collection. I design a 400 state Markov chain is unique to this example. Since it depends on to approximate a model selection problem or the hidden state probabilities and their distor- estimation problem for investors. One hundred tions, this term is time varying. Its magnitude states were used for each of the four submod- is determined in part by the parameter ub used els. The corresponding intensity matrix A is 26 AEA PAPERS AND PROCEEDINGS MAY 2007 block diagonal with four blocks. I construct the problem of estimating the parameters gov- the first three submodels by approximating the erning the dynamics of { t}. consumption dynamics given in Example 2, in I find it convenient to think of the first three which the process { t} is hidden from the agents. submodels as three different parameter specifi- I use three different values of the autoregres- cations of predictability in consumption growth. sive parameter 0.97, 0.98, and 0.99. The cor- By including all three in the analysis I have responding coefficients on the shock u2, t11 (the approximated an estimation problem. The fourth conditional standard deviations of the hidden submodel is different because consumption state process) were obtained by maximizing growth rates are not predictable. As a conse- the likelihood over this parameter conditioned quence it implies less long-run uncertainty. The on the autoregressive parameter and the coeffi- signal history of postwar consumption growth cient 0.0054 on the shock u1, t11 to the consump- does not allow investors either to confirm or tion growth rate equation. I use the method of reject this fourth submodel with full confidence. George Tauchen and Robert Hussey (1991) to Probabilities are tilted away from this submodel obtain a discrete state approximation for each based on the continuation values. A string of rel- of these three models. The other 100 states are ative high or relatively low consumption growth all invariant states designed to approximate rates both give evidence for consumption pre- alternative mean growth rates. I apply a stan- dictability. The relatively high growth rates dard quadrature method in constructing the induce less tilting toward the submodels with discrete states.24 For computational purposes predictability in consumption because if con- and for the computation of uncertainty prices, sumption is predictable it should remain high, I take the discrete states literally; but the setup at least temporarily. In contrast, relatively low is designed to be similar to an economy studied growth rates in consumption induce more tilt- in great depth by Hansen and Sargent (2006). ing toward the submodels with predictable con- In that paper, we used Kalman filtering meth- sumption growth and this in turn gives a larger ods for two alternative models and computed uncertainty premia. the sequence of posterior probabilities for these models given sample evidence on consump- VIII. Extensions tion. Here, I use the same data as used in that analysis to solve the filtering problem. There are very special ingredients in my The second term in Table 2 is time invariant. example economies. They were designed in part By changing uf, I alter the level of the uncer- to magnify the impact of learning on uncertainty tainty premia. This is reflected in Figure 5. The prices. On the other hand, there are empirical two lower curves were computed for uf 5 0.10 limitations to these economies that can be antic- and upper curves for uf 5 0.05. A smaller value ipated from previous literature. of uf implies less penalization in the investors’ My example economies have arguably with- search over alternative probability distributions. held too much information from economic The third term in Table 2 induces time series agents. For instance, multiple signals make variation in the uncertainty premia while hav- learning more informative, and it remains valu- ing little impact on the level. The smooth curves able to explore implications that allow for an in Figure 5 are computed for ub 5 24 and the econometrician to understate the knowledge more volatile curves for ub 5 6. By construction, of economic agents. Learning within a rational the time series trajectories are similar to those expectations equilibrium already adds to the reported in the more comprehensive analysis by econometrician’s challenge by requiring an ini- Hansen and Sargent (2006) except that I have tial specification or prior for the beliefs about introduced additional models to approximate hidden Markov states, parameters, or model indicators. The decision theory that we explored avoids the reduction of compound lotteries and thus prevents direct application of the Law of 24 To form an (approximate) intensity matrix for the con- Iterated Expectations as commonly used in tinuous-time Markov chain, I subtracted an identity matrix rational expectations econometrics to deduce from the discrete-time transition probability matrix. robust implications. While econometric analysis VOL. 97 NO. 2 Richard T. Ely Lecture 27 may be more challenging, it is a challenge with IX. Conclusion potentially valuable payoffs. I chose not to feature models in which there The cross-equation restrictions used in ratio- is conditional volatility present in the evolution nal expectations econometrics get much of their for consumption. I did this to show how learning empirical power by endowing agents with more can induce time variation in uncertainty prices precise information than econometricians. This without an additional exogenous source of includes information on endowments, cash flows variation. Low-frequency volatility movements, and technology shocks. The rational expectations however, are a potentially important additional agents have done a lot of unmodeled work before ingredient.25 the econometrician steps in. In this paper I have Similarly, I restricted the IES (intertemporal explored ways to close this informational gap by elasticity of substitution) to be unity to simplify giving economic agents some skepticism about the characterization of value functions and the models they use. I showed how investor con- probability distortions. For models that seek to cerns about statistical ambiguity are reflected in understand better wealth and aggregate stock equilibrium prices. In our example economies, price dynamics, this restriction is problematic I avoided endowing economic agents with full because it implies constant wealth consumption confidence in probability models that are demon- ratios. On the other hand, approximating around strably hard to estimate. By introducing learning an economy with IES 5 1 can be a useful char- within an equilibrium, I showed how learning is acterization device as I illustrated in Section I. reflected in the dynamic evolution of local uncer- My focus on one-period (in discrete time) or tainty prices. These uncertainty premia reflect local (in continuous-time) uncertainty prices investors’ doubts about their probability models. made it more difficult for learning to matter. Learning about very low frequency events includ- If learning matters for short-horizon valuation, ing the primitive model specification can lead to then its impact should be more potent for lon- uncertainty premia that are large when macro- ger horizons. Recent asset pricing literature has economic growth is sluggish. This changes the focused on the role of long-run uncertainty on structure of cross-equation restrictions, but not cash flow valuation. (For example, see Campbell necessarily their potency. While there are other and Tuomo Vuolteenaho 2004; Bansal, Robert possible interpretations for the equilibrium out- F. Dittmar, and Christian T. Lundblad 2005; comes I displayed, including changing beliefs or Jesus Santos and Pietro Veronesi 2005; Hansen, embracing preferences that decompose risks in Heaton, and Li 2006; and Mariano Croce, alternative ways, I find the relation to statistical Martin Lettau, and Sydney C. Ludvigson 2006.) ambiguity to be the most appealing. Since statistical measurements for long-horizons There are analogous questions regarding the are known to be fragile, formally incorporat- role of uncertainty in the exploration of hypo- ing learning into such analysis is an obvious but thetical government interventions. The models important extension. I used drew a distinction between risks con- My models imposed homogeneity on inves- ditioned on a hidden model specification or a tors. This allowed me to compute a single tilted hidden state, and uncertainty about that speci- probability model and simplified my analy- fication or hidden state. If this distinction is sis. While introducing heterogeneity among important in understanding evidence from secu- ­investors will complicate model solution, it has rity market data, then use of this evidence in the intriguing possibilities. The investors will slant analysis of stochastic interventions will require probabilities in different directions giving rise a careful accounting of the probability struc- of a form of ex post heterogeneity in beliefs. ture of the policy intervention. What skepticism There is much more to be done. will economic agents have about the alternative probability structure, and what role will learn- ing play in validating or altering beliefs? While such distinctions are not typical in the formal 25 Weitzman (2007) has recently shown that for some priors on volatility, learning can be particularly challeng- analysis of policy changes, perhaps they should ing and consequently can have a big impact on the pre- become part of the normative vocabulary as dicted asset returns. argued in the context of monetary policy by 28 AEA PAPERS AND PROCEEDINGS MAY 2007

Milton Friedman. Rational expectations models Cecchetti, Stephen G., Pok-sang Lam, and Nelson have been demonstrably successful in featuring C. Mark. 2000. “Asset Pricing with Distorted the role of credibility in policy making, but there Beliefs: Are Equity Returns Too Good to Be is scope to explore further the role of beliefs, True?” American Economic Review, 90(4): doubts, and learning in a formal way. 787–805. Chen, Xiaohong, and Halbert White. 1998. “Non­ references parametric Adaptive Learning with Feedback.” Journal of Economic Theory, 82(1): 190–222. Abel, Andrew B. 2002. “An Exploration of the Chernoff, Herman. 1952. “A Measure of Asymp­ Effects of Pessimism and Doubt on Asset totic Efficiency for Tests of a Hypothesis Returns.” Journal of Economic Dynamics and Based on the Sum of Observations.” Annals of Control, 26(7–8): 1075–92. Mathematical Statistics, 23: 493–507. Anderson, Evan W., Lars Peter Hansen, and Croce, Mariano M., Martin Lettau, and Sydney Thomas J. Sargent. 2003. “A Quartet of Semi­ C. Ludvigson. 2006. “Investor Information, groups for Model Specification, Robustness, Long-Run Risk, and the Duration of Risky Prices of Risk, and Model Detection.” Journal Assets.” Unpublished. of the European Economic Association, 1(1): Cumby, Robert E., John Huizinga, and Maurice 68–123. Obstfeld. 1983. “Two-Step Two-Stage Least Bansal, Ravi, Robert F. Dittmar, and Christian Squares Estimation in Models with Rational T. Lundblad. 2005. “Consumption, Dividends, Expectations.” Journal of Econometrics, 21(3): and the Cross Section of Equity Returns.” 333–55. Journal of Finance, 60(4): 1639–72. David, Alexander. 1997. “Fluctuating Confidence Bansal, Ravi, and . 2004. “Risks for in Stock Markets: Implications for Returns and the Long Run: A Potential Resolution of Asset Volatility.” Journal of Financial and Quanti­ Pricing Puzzles.” Journal of Finance, 59(4): tative Analysis, 32(4): 427–62. 1481–1509. Duffie, Darrell, and Larry G. Epstein. 1992. “Sto­ Bray, Margaret. 1982. “Learning, Estimation, chastic Differential Utility.” Econometrica, and the Stability of Rational Expectations.” 60(2): 353–94. Journal of Economic Theory, 26(2): 318–39. Epstein, Larry G., and Martin Schneider.  2003. Bray, Margaret, and David M. Kreps.  1987. “Recursive Multiple-Priors.” Journal of Eco­ “Rational Learning and Rational Expecta­ nomic Theory, 113(1): 1–31. tions.” In Arrow and the Ascent of Modern Epstein, Larry G., and Martin Schneider. 2006. Economic Theory, ed. George R. Feiwel, 597– “Learning under Ambiguity.” Unpublished. 625. New York: New York University Press. Epstein, Larry G., and Stanley E. Zin.  1989. Cagetti, Marco, Lars Peter Hansen, Thomas J. “Substitution, Risk Aversion, and the Temporal Sargent, and Noah Williams.  2002. “Robust­ Behavior of Consumption and Asset Returns: ness and Pricing with Uncertain Growth.” A Theoretical Framework.” Econometrica, Review of Financial Studies, 15(2): 363–404. 57(4): 937–69. Campbell, John Y.  1996. “Understanding Risk Evans, George W., and Seppo Honkapohja. 2001. and Return.” Journal of Political Economy, Learning and Expectations in Macroeconom­ 104(2): 298–345. ics. Princeton: Princeton University Press. Campbell, John Y., and John H. Cochrane.  Geweke, John.  2001. “A Note on Some Limita­ 1999. “By Force of Habit: A Consumption- tions of CRRA Utility.” Economics Letters, Based Explanation of Aggregate Stock Market 71(3): 341–45. Behavior.” Journal of Political Economy, Gibbons, Michael R., Stephen A. Ross, and Jay 107(2): 205–51. Shanken.  1989. “A Test of the Efficiency of Campbell, John Y., and Robert J. Shiller.  1988. a Given Portfolio.” Econometrica, 57(5): “Stock Prices, Earnings, and Expected Divi­ 1121–52. dends.” Journal of Finance, 43(3): 661–76. Gilboa, Itzhak, and David Schmeidler. 1989. Campbell, John Y., and Tuomo Vuolteenaho. “Maxmin Expected Utility with Non-Unique 2004. “Bad Beta, Good Beta.” American Eco­ Prior.” Journal of Mathematical Economics, nomic Review, 94(5): 1249–75. 18(2): 141–53. VOL. 97 NO. 2 Richard T. Ely Lecture 29

Hamilton, James D. 1989. “A New Approach to Hansen, Lars Peter, and Kenneth J. Singleton. the Economic Analysis of Nonstationary Time 1982. “Generalized Instrumental Variables Series and the Business Cycle.” Econometrica, Estimation of Nonlinear Rational Expectations 57(2): 357–84. Models.” Econometrica, 50(5): 1269–86. Hansen, Lars Peter. 1982. “Large Sample Proper­ Hansen, Lars Peter, Thomas J. Sargent, Gauhar ties of Generalized Method of Moments Esti­ Turmuhambetova, and Noah Williams.  2006. mators.” Econometrica, 50(4): 1029–54. “Robust Control and Model Misspecification.” Hansen, Lars Peter, John C. Heaton, Junghoon Journal of Economic Theory, 128(1): 45–90. Lee, and Nikolai Roussanov.  Forthcoming. Hayashi, Fumio, and Christopher A. Sims. 1983. “Risk Aversion and Intertemporal Substitu­ “Nearly Efficient Estimation of Time Series tion.” In Handbook of Econometrics, ed. James Models with Predetermined, but Not Exog­ J. Heckman. Amsterdam: Elsevier Science, enous, Instruments.” Econometrica, 51(3): North-Holland. 783–98. Hansen, Lars Peter, John C. Heaton, and Nan Li. Heaton, John. 1995. “An Empirical Investigation 2006. “Consumption Strikes Back? Measuring of Asset Pricing with Temporally Dependent Long-Run Risk.” Unpublished. Preference Specifications.” Econometrica, Hansen, Lars Peter, and Ravi Jagannathan. 1991. 63(3): 681–717. “Implications of Security Market Data for Hindy, Ayman, and Chi-fu Huang. 1992. “Inter­ Models of Dynamic Economies.” Journal of temporal Preferences for Uncertain Con­ Political Economy, 99(2): 225–62. sumption: A Continuous Time Approach.” Hansen, Lars Peter, and Scott F. Richard.  1987. Econometrica, 60(4): 781–801. “The Role of Conditioning Information in Jacobson, David H.  1973. “Optimal Stochastic Deducing Testable Restrictions Implied by Linear Systems with Exponential Performance Dynamic Asset Pricing Models.” Econo­ Criteria and Their Relation to Deterministic metrica, 55(3): 587–613. Differential Games.” IEEE Transactions for Hansen, Lars Peter, William Robards, and Automatic Control, 18: 1124–31. Thomas J. Sargent. 1991. “Time Series Implica­ Klibanoff, Peter, Massimo Marinacci, and Sujoy tions of Present Value Budget Balance and of Mukerji. 2005. “A Smooth Model of Decision Martingale Models of Consumption and Taxes.” Making under Ambiguity.” Econometrica, In Rational Expectations Econometrics, ed. 73(6): 1849–92. Lars Peter Hansen and Thomas J. Sargent, Kreps, David M., and Evan L. Porteus. 1978. 121–61. Boulder, CO: Westview Press. “Temporal Resolution of Uncertainty and Hansen, Lars Peter, and Thomas J. Sargent. 1980. Dynamic Choice Theory.” Econometrica, “Formulating and Estimating Dynamic Linear 46(1): 185–200. Rational Expectations Models.” Journal of Lettau, Martin, and Sydney Ludvigson.  Forth­ Economic Dynamics and Control, 2(1): 7–46. coming. “Measuring and Modeling Variation Hansen, Lars Peter, and Thomas J. Sargent.  in the Risk-Return Trade-Off.” In Handbook 1991. “Exact Linear Rational Expectations of , ed. Yacine Ait- Models: Specification and Estimation.” In Sahalia and Lars Peter Hansen. Amsterdam: Rational Expectations Econometrics, ed. Lars Elsevier Science, North-Holland. Peter Hansen and Thomas J. Sargent, 45–76. Liu, Jun, Jun Pan, and Tan Wang.  2005. “An Boulder, CO: Westview Press. Equilibrium Model of Rare-Event Premia and Hansen, Lars Peter, and Thomas J. Sargent. Its Implication for Option Smirks.” Review of 2006. “Fragile Beliefs and Price of Model Financial Studies, 18(1): 131–64. Uncertainty.” Unpublished. Lucas, Robert E., Jr.  1972a. “Econometric Hansen, Lars Peter, and Thomas J. Sargent.  Testing of the Natural Rate Hypothesis.” In Forthcoming. “Recursive Robust Estimation The Econometrics of Price Determination, ed. and Control without Commitment.” Journal of Otto Eckstein. Washington, DC: Board of Gov­ Economic Theory. ernors of the Federal Reserve System, 50–59. Hansen, Lars Peter, and Jose Scheinkman. 2006. Lucas, Robert E., Jr. 1972b. “Expectations and “Long Term Risk: An Operator Approach.” the Neutrality of Money.” Journal of Economic Unpublished. Theory, 4(2): 103–24. 30 AEA PAPERS AND PROCEEDINGS MAY 2007

Lucas, Robert E., Jr. 1978. “Asset Prices in an Sargent, Thomas J. 1973. “Rational Expectations, Exchange Economy.” Econometrica, 46(6): the Real Rate of Interest, and the Natural Rate 1429–45. of Unemployment.” Brookings Papers on Eco­ Lucas, Robert E., Jr., and Edward C. Prescott.  nomic Activity, 2: 429–72. 1971. “Investment under Uncertainty.” Econo­ Sargent, Thomas J.  1999. The Conquest of metrica, 39(5): 659–81. American Inflation. Princeton: Princeton Uni­ Maccheroni, Fabio, Massimo Marinacci, and Aldo versity Press. Rustichini. 2006a. “Ambiguity Aversion, Robust- Sclove, Stanley L. 1983. “Time-Series Segmen­ ness, and the Variational Representation of tation: A Method and a Model.” Information Preferences.” Econometrica, 74(6): 1147–98. Science, 29: 7–25. Maccheroni, Fabio, Massimo Marinacci, and Segal, Uzi.  1990. “Two-Stage Lotteries without Aldo Rustichini. 2006b. “Dynamic Variational the Reduction Axiom.” Econometrica, 58(2): Preferences.” Journal of Economic Theory, 349–77. 128(1): 4–44. Sharpe, William F. 1964. “Capital Asset Prices: Maenhout, Pascal. 2006. “Robust Portfolio Rules A Theory of Market Equilibrium.” Journal of and Detection-Error Probabilities for a Mean- Finance, 19(3): 425–442. Reverting Risk Premium.” Journal of Eco­ Shiller, Robert J. 1972. “Rational Expectations nomic Theory, 128(1): 136–63. and the Structure of Interest Rates.” PhD Marcet, Albert, and Thomas J. Sargent. 1989. diss. Massachusetts Institute of Technology. “Convergence of Least Squares Learning Shiller, Robert J. 1982. “Consumption, Asset Mechanisms in Self-Referential Linear Sto­ Markets and Macroeconomic Fluctuations.” chastic Models.” Journal of Economic Theory, Carnegie-Rochester Conference Series on 48(2): 337–68. Public Policy, 17: 203–38. Mehra, Rajnish, and Edward C. Prescott. 1985. Tallarini, Thomas D., Jr.  2000. “Risk-Sensitive “The Equity Premium: A Puzzle.” Journal of Real Business Cycles.” Journal of Monetary Monetary Economics, 15(2): 145–61. Economics, 45(3): 507–32. Muth, John F.  1961. “Rational Expectations Tauchen, George, and Robert Hussey.  1991. and the Theory of Price Movements.” Econo­ “Quadrature-Based Methods for Obtain- metrica, 29(3): 315–335. ing Approximate Solutions to Nonlinear Newman, Charles M., and Barton Stuck, W. 1979. Asset Pricing Models.” Econometrica, 59(2): “Chernoff Bounds for Discriminating between 371–96. Two Markov Processes.” Stochastics, 2: 139–53. Veronesi, Pietro. 2000. “How Does Information Restoy, Fernando, and Philippe Weil.  1998. Quality Affect Stock Returns?” Journal of “Approximate Equilibrium Asset Prices.” Finance, 55(2): 807–37. National Bureau of Economic Research Weitzman, Martin.  Forthcoming. “Prior Sensi­ Working Paper 6611. tive Expectations and Asset Return Puzzles.” Santos, Jesus, and Pietro Veronesi. 2005. “Cash- American Economic Review. Flow Risk, Discount Risk, and the Value Whittle, Peter. 1981. “Risk Sensitive Linear Quad­ Premium.” National Bureau of Economic ratic Gaussian Control.” Advances in Applied Research Working Paper 11816. Probability, 13: 764–77. Saracoglu, Rusdu, and Thomas J. Sargent. 1978. Wonham, W. M. 1964. “Some Applications of “Seasonality and Portfolio Balance under Stochastic Differential Equations to Optimal Rational Expectations.” Journal of Monetary Nonlinear Filtering.” SIAM Journal on Con­ Economics, 4(3): 435–58. trol, 2(3): 347–69. This article has been cited by:

1. Hyosung Kwon, Jianjun Miao. 2019. WOODFORD'S APPROACH TO ROBUST POLICY ANALYSIS IN A LINEAR-QUADRATIC FRAMEWORK. Macroeconomic Dynamics 23:5, 1895-1920. [Crossref] 2. Jianjun Miao, Bin Wei, Hao Zhou. 2019. Ambiguity Aversion and the Variance Premium. Quarterly Journal of Finance 09:02, 1950003. [Crossref] 3. Harjoat S. Bhamra, Raman Uppal. 2019. Does Household Finance Matter? Small Financial Errors with Large Social Costs. American Economic Review 109:3, 1116-1154. [Abstract] [View PDF article] [PDF with links] 4. RAVI JAGANNATHAN, BINYING LIU. 2019. Dividend Dynamics, Learning, and Expected Stock Index Returns. The Journal of Finance 74:1, 401-448. [Crossref] 5. Julian Kozlowski, Laura Veldkamp, Venky Venkateswaran. 2019. The Tail That Keeps the Riskless Rate Low. NBER Macroeconomics Annual 33, 253. [Crossref] 6. A Ronald Gallant, Mohammad R Jahan-Parvar, Hening Liu. 2018. Does Smooth Ambiguity Matter for Asset Pricing?. The Review of Financial Studies 43. . [Crossref] 7. Masakatsu Okubo. 2018. On the computation of detection error probabilities under normality assumptions. Economics Letters 171, 106-109. [Crossref] 8. Brian Hill, Tomasz Michalski. 2018. Risk versus ambiguity and international security design. Journal of International Economics 113, 74-105. [Crossref] 9. Fabrice Collard, Sujoy Mukerji, Kevin Sheppard, Jean-Marc Tallon. 2018. Ambiguity and the historical equity premium. Quantitative Economics 9:2, 945-993. [Crossref] 10. Lars Peter Hansen. 2018. Comment. NBER Macroeconomics Annual 32:1, 479-489. [Crossref] 11. Andras Fulop, Junye Li, Runqing WAN. 2018. Real-Time Learning and Bond Return Predictability. SSRN Electronic Journal . [Crossref] 12. Sergiy Verstyuk. 2018. Ignorance and Indifference: Decision-Making in the Lab and in the Market. SSRN Electronic Journal . [Crossref] 13. Qian Qi. 2018. Robust Tobin's Q Theory. SSRN Electronic Journal . [Crossref] 14. Daniele Bianchi, Jacopo Piana. 2018. Adaptive Expectations and Commodity Risk Premia. SSRN Electronic Journal . [Crossref] 15. Sergiy Verstyuk. 2018. Log, Stock and Two Simple Lotteries. SSRN Electronic Journal . [Crossref] 16. Guihai Zhao. 2017. Confidence, bond risks, and equity returns. Journal of Financial Economics 126:3, 668-688. [Crossref] 17. Ian Jewitt, Sujoy Mukerji. 2017. Ordering ambiguous acts. Journal of Economic Theory 171, 213-267. [Crossref] 18. Stefano d'Addona. 2017. LONG-RUN RISK AND MONEY MARKET RATES: AN EMPIRICAL ASSESSMENT. Macroeconomic Dynamics 21:4, 1096-1117. [Crossref] 19. Chun-Lei Yang, Lan Yao. 2017. Testing ambiguity theories with a mean-preserving design. Quantitative Economics 8:1, 219-238. [Crossref] 20. Emi Nakamura, Dmitriy Sergeyev, Jón Steinsson. 2017. Growth-Rate and Uncertainty Shocks in Consumption: Cross-Country Evidence. American Economic Journal: Macroeconomics 9:1, 1-39. [Abstract] [View PDF article] [PDF with links] 21. Gurdip Bakshi, Xiaohui Gao Bakshi, Jinming Xue. 2017. An Approach to Measure the Expectation of Generic Functions of the Market Return. SSRN Electronic Journal . [Crossref] 22. Lars Peter Hansen, Thomas J. Sargent. 2017. Prices of Macroeconomic Uncertainties with Tenuous Beliefs. SSRN Electronic Journal . [Crossref] 23. Balint Szoke. 2017. Estimating Robustness. SSRN Electronic Journal . [Crossref] 24. A. Ronald Gallant, Mohammad R. Jahan-Parvar, Hening Liu. 2017. Does Smooth Ambiguity Matter for Asset Pricing?. SSRN Electronic Journal . [Crossref] 25. Philippe Andrade, Richard K. Crump, Stefano Eusepi, Emanuel Moench. 2016. Fundamental disagreement. Journal of Monetary Economics 83, 106-128. [Crossref] 26. Zhongjun Qu, Denis Tkachenko. 2016. Global Identification in DSGE Models Allowing for Indeterminacy. The Review of Economic Studies rdw048. [Crossref] 27. Adrian Buss, Bernard Dumas, Raman Uppal, Grigory Vilkov. 2016. The intended and unintended consequences of financial-market regulations: A general-equilibrium analysis. Journal of Monetary Economics 81, 25-43. [Crossref] 28. Demian Pouzo, Ignacio Presno. 2016. Sovereign Default Risk and Uncertainty Premia. American Economic Journal: Macroeconomics 8:3, 230-266. [Abstract] [View PDF article] [PDF with links] 29. Roland Füss, Thomas Gehrig, Philipp B. Rindler. 2016. Changing Risk Perception and the Time- Varying Price of Risk*. Review of Finance 20:4, 1549-1585. [Crossref] 30. MICHAEL JOHANNES, LARS A. LOCHSTOER, YIQUN MOU. 2016. Learning about Consumption Dynamics. The Journal of Finance 71:2, 551-600. [Crossref] 31. Patrick Augustin, Roméo Tédongap. 2016. Real Economic Shocks and Sovereign Credit Risk. Journal of Financial and Quantitative Analysis 51:2, 541-587. [Crossref] 32. Pierre Collin-Dufresne, Michael Johannes, Lars A. Lochstoer. 2016. Parameter Learning in General Equilibrium: The Asset Pricing Implications. American Economic Review 106:3, 664-698. [Abstract] [View PDF article] [PDF with links] 33. Emanuele Borgonovo, Elmar Plischke. 2016. Sensitivity analysis: A review of recent advances. European Journal of Operational Research 248:3, 869-887. [Crossref] 34. Itzhak Gilboa, Massimo Marinacci. Ambiguity and the Bayesian Paradigm 385-439. [Crossref] 35. Bo Liu, Lei Lu, Jinqiang Yang. 2016. Heterogeneous Preferences, Investment and Asset Pricing. SSRN Electronic Journal . [Crossref] 36. Adrian Buss, Bernard Dumas, Raman Uppal, Grigory Vilkov. 2016. The Intended and Unintended Consequences of Financial-Market Regulations: A General Equilibrium Analysis. SSRN Electronic Journal . [Crossref] 37. Harjoat Singh Bhamra, Raman Uppal. 2016. Does Household Finance Matter? Small Financial Errors with Large Social Costs. SSRN Electronic Journal . [Crossref] 38. Harjoat Singh Bhamra, Raman Uppal. 2016. Do Individual Behavioral Biases Affect Financial Markets and the Macroeconomy?. SSRN Electronic Journal . [Crossref] 39. Bong-Gyu Jang, Kyeong Tae Kim, Hyun-Tak Lee. 2016. Mark-to-Market Reinsurance and Portfolio Selection: Implications for Information Quality. SSRN Electronic Journal . [Crossref] 40. Sergiy Verstyuk. 2016. Log, Stock and Two Simple Lotteries. SSRN Electronic Journal . [Crossref] 41. Shuting (Sophia) Hu, Shane A. Johnson, Yan Liu. 2016. The Asset Pricing Implications of Disclosed Risk Factors. SSRN Electronic Journal . [Crossref] 42. Lars Peter Hansen, Thomas J. Sargent. 2016. Prices of Macroeconomic Uncertainties with Tenuous Beliefs. SSRN Electronic Journal . [Crossref] 43. Jerry Tsai, Jessica A. Wachter. 2015. Disaster Risk and Its Implications for Asset Pricing. Annual Review of Financial Economics 7:1, 219-252. [Crossref] 44. Massimo Marinacci. 2015. MODEL UNCERTAINTY. Journal of the European Economic Association 13:6, 1022-1100. [Crossref] 45. Emanuele Borgonovo, Massimo Marinacci. 2015. Decision analysis under ambiguity. European Journal of Operational Research 244:3, 823-836. [Crossref] 46. Luca Benzoni, Pierre Collin-Dufresne, Robert S. Goldstein, Jean Helwege. 2015. Modeling Credit Contagion via the Updating of Fragile Beliefs. The Review of Financial Studies 28:7, 1960-2008. [Crossref] 47. Laurent E. Calvet, Veronika Czellar. 2015. Through the looking glass: Indirect inference via simple equilibria. Journal of Econometrics 185:2, 343-358. [Crossref] 48. Andras Fulop, Junye Li, Jun Yu. 2015. Self-Exciting Jumps, Learning, and Asset Pricing Implications. Review of Financial Studies 28:3, 876-912. [Crossref] 49. Daehee Jeong, Hwagyun Kim, Joon Y. Park. 2015. Does ambiguity matter? Estimating asset pricing models with a multiple-priors recursive utility. Journal of Financial Economics 115:2, 361-382. [Crossref] 50. Ting Wu, Xiaoneng Zhu. 2015. Model Uncertainty and Bond Pricing. SSRN Electronic Journal . [Crossref] 51. Christian C. Opp. 2015. Intertemporal Information Acquisition and Investment Dynamics. SSRN Electronic Journal . [Crossref] 52. Marco Giacoletti, Kristoffer Laursen, Kenneth J. Singleton. 2015. Learning, Dispersion of Beliefs, and Risk Premiums in an Arbitrage-Free Term Structure Model. SSRN Electronic Journal . [Crossref] 53. Demian Pouzo, Ignacio Presno. 2015. Sovereign Default Risk and Uncertainty Premia. SSRN Electronic Journal . [Crossref] 54. Ravi Jagannathan, Binying Liu. 2015. Dividend Dynamics, Learning, and Expected Stock Index Returns. SSRN Electronic Journal . [Crossref] 55. Julian Kozlowski, Laura Veldkamp, Venky Venkateswaran. 2015. The Tail that Wags the Economy: Belief-Driven Business Cycles and Persistent Stagnation. SSRN Electronic Journal . [Crossref] 56. Anna Orlik. 2015. Understanding Uncertainty Shocks and the Role of Black Swans. SSRN Electronic Journal . [Crossref] 57. Jaroslav Borovička, Lars Peter Hansen. 2014. Examining macroeconomic models through the lens of asset pricing. Journal of Econometrics 183:1, 67-90. [Crossref] 58. Hui Chen, Nengjiu Ju, Jianjun Miao. 2014. Dynamic asset allocation with ambiguous return predictability. Review of Economic Dynamics 17:4, 799-823. [Crossref] 59. Mohammad R. Jahan-Parvar, Hening Liu. 2014. Ambiguity Aversion and Asset Prices in Production Economies. Review of Financial Studies 27:10, 3060-3097. [Crossref] 60. A. M. Viale, L. Garcia-Feijoo, A. Giannetti. 2014. Safety First, Learning Under Ambiguity, and the Cross-Section of Stock Returns. Review of Asset Pricing Studies 4:1, 118-159. [Crossref] 61. Alexander David, Pietro Veronesi. 2014. Investors' and Central Bank's Uncertainty Embedded in Index Options. Review of Financial Studies 27:6, 1661-1716. [Crossref] 62. Valerio Potì, Richard M. Levich, Pierpaolo Pattitoni, Paolo Cucurachi. 2014. Predictability, trading rule profitability and learning in currency markets. International Review of Financial Analysis 33, 117-129. [Crossref] 63. MICHAEL JOHANNES, ARTHUR KORTEWEG, NICHOLAS POLSON. 2014. Sequential Learning, Predictability, and Optimal Portfolio Returns. The Journal of Finance 69:2, 611-644. [Crossref] 64. Harjoat S. Bhamra, Raman Uppal. 2014. Asset Prices with Heterogeneity in Preferences and Beliefs. Review of Financial Studies 27:2, 519-580. [Crossref] 65. Brian Hill, Tomasz Kamil Michalski. 2014. Risk versus Ambiguity and International Security Design. SSRN Electronic Journal . [Crossref] 66. Colm Kearney. 2013. Business Education in Asia and Australasia: Recent Trends and Future Prospects. Journal of Teaching in International Business 24:3-4, 214-227. [Crossref] 67. Massimo Guidolin, Francesca Rinaldi. 2013. Ambiguity in asset pricing and portfolio choice: a review of the literature. Theory and Decision 74:2, 183-217. [Crossref] 68. Antoine Bommier, Francois Le Grand. 2013. A Robust Approach to Risk Aversion. SSRN Electronic Journal . [Crossref] 69. Antoine Bommier, Francois Le Grand. 2013. A Robust Approach to Risk Aversion. SSRN Electronic Journal . [Crossref] 70. Hui Chen, Winston Wei Dou, Leonid Kogan. 2013. Measuring the 'Dark Matter' in Asset Pricing Models. SSRN Electronic Journal . [Crossref] 71. Steven Wei Ho. 2013. Learning in International Markets and a Rational Expectation Approach to the Contagion Puzzle. SSRN Electronic Journal . [Crossref] 72. Laurent E. Calvet, Veronika Czellar. 2013. Through the Looking Glass: Indirect Inference via Simple Equilibria. SSRN Electronic Journal . [Crossref] 73. Larry G. Epstein, Emmanuel Farhi, Tomasz Strzalecki. 2013. How Much Would You Pay to Resolve Long-Run Risk?. SSRN Electronic Journal . [Crossref] 74. Cosmin Ilut. 2012. Ambiguity Aversion: Implications for the Uncovered Interest Rate Parity Puzzle. American Economic Journal: Macroeconomics 4:3, 33-65. [Abstract] [View PDF article] [PDF with links] 75. H. Chen, S. Joslin, N.-K. Tran. 2012. Rare Disasters and Risk Sharing with Heterogeneous Beliefs. Review of Financial Studies 25:7, 2189-2224. [Crossref] 76. Colm Kearney. 2012. Emerging markets research: Trends, issues and future directions. Emerging Markets Review 13:2, 159-183. [Crossref] 77. ĽUBOŠ PÁSTOR, ROBERT F. STAMBAUGH. 2012. Are Stocks Really Less Volatile in the Long Run?. The Journal of Finance 67:2, 431-478. [Crossref] 78. Lars Peter Hansen. 2012. Comment. NBER Macroeconomics Annual 26:1, 132-143. [Crossref] 79. Andreas Fuster, Benjamin Hebert, David Laibson. 2012. Natural Expectations, Macroeconomic Dynamics, and Asset Pricing. NBER Macroeconomics Annual 26:1, 1-48. [Crossref] 80. Valerio Potì, Akhtar R. Siddique. 2012. GMM-Based Tests of Efficient Market Learning and an Application to Testing for a Small Firm Effect in Equity Pricing. SSRN Electronic Journal . [Crossref] 81. Lars Peter Hansen. 2012. Challenges in Identifying and Measuring Systemic Risk. SSRN Electronic Journal . [Crossref] 82. Pierre Collin-Dufresne, Michael Johannes, Lars A. Lochstoer. 2012. Parameter Learning in General Equilibrium: The Asset Pricing Implications. SSRN Electronic Journal . [Crossref] 83. Alexander David, Pietro Veronesi. 2012. Investors' and Central Bank's Uncertainty Measures Embedded in Index Options. SSRN Electronic Journal . [Crossref] 84. Mohammad R. Jahan-Parvar, Hening Liu. 2012. Ambiguity Aversion and Asset Prices in Production Economies. SSRN Electronic Journal . [Crossref] 85. Jianjun Miao, Bin Wei, Hao Zhou. 2012. Ambiguity Aversion and Variance Premium. SSRN Electronic Journal . [Crossref] 86. Raj Aggarwal, Colm Kearney, Brian M. Lucey. 2012. Gravity and Culture in Foreign Portfolio Investment. SSRN Electronic Journal . [Crossref] 87. Pierre Collin-Dufresne, Michael Johannes, Lars A. Lochstoer. 2012. Parameter Learning in General Equilibrium: The Asset Pricing Implications. SSRN Electronic Journal . [Crossref] 88. Colm Kearney, Ciaran Mac an Bhaird, Brian M. Lucey. 2012. Culture and Capital Structure in Small and Medium Sized Firms. SSRN Electronic Journal . [Crossref] 89. Raj Aggarwal, Colm Kearney, Brian Lucey. 2011. Gravity and culture in foreign portfolio investment. Journal of Banking & Finance . [Crossref] 90. AurÉlien Baillon,, Olivier L'Haridon,, Laetitia Placido. 2011. Ambiguity Models and the Machina Paradoxes. American Economic Review 101:4, 1547-1560. [Abstract] [View PDF article] [PDF with links] 91. S. Cerreia-Vioglio, F. Maccheroni, M. Marinacci, L. Montrucchio. 2011. Uncertainty averse preferences. Journal of Economic Theory . [Crossref] 92. Lars Peter Hansen, Thomas J. Sargent. 2011. Robustness and ambiguity in continuous time. Journal of Economic Theory 146:3, 1195-1223. [Crossref] 93. Nabil I. Al-Najjar, Luciano de Castro. Subjective Probability . [Crossref] 94. 2011. Intertemporal substitution and recursive smooth ambiguity preferences. Theoretical Economics 6:3, 423-472. [Crossref] 95. J. Borovicka, L. P. Hansen, M. Hendricks, J. A. Scheinkman. 2011. Risk-Price Dynamics. Journal of Financial Econometrics 9:1, 3-65. [Crossref] 96. Patrick Augustin, Romeo Tedongap. 2011. Sovereign Credit Risk and Real Economic Shocks. SSRN Electronic Journal . [Crossref] 97. Hui Chen, Scott Joslin, Ngoc-Khanh Tran. 2011. Rare Disasters and Risk Sharing with Heterogeneous Beliefs. SSRN Electronic Journal . [Crossref] 98. Andras Fulop, Junye Li, Jun Yu. 2011. Bayesian Learning of Impacts of Self-Exciting Jumps in Returns and Volatility. SSRN Electronic Journal . [Crossref] 99. Roland Füss, Thomas Gehrig, Philipp B. Rindler. 2011. Scattered Trust - Did the 2007-08 Financial Crisis Change Risk Perceptions?. SSRN Electronic Journal . [Crossref] 100. Fabrice Collard, Sujoy Mukerji, Kevin Keith Sheppard, J.-Marc Tallon. 2011. Ambiguity and the Historical Equity Premium. SSRN Electronic Journal . [Crossref] 101. Sujoy Mukerji, Fabrice Collard, Kevin Keith Sheppard, J.-Marc Tallon. 2011. Ambiguity and the Historical Equity Premium. SSRN Electronic Journal . [Crossref] 102. Laurent E. Calvet, Veronika Czellar. 2011. State-Observation Sampling and the Econometrics of Learning Models. SSRN Electronic Journal . [Crossref] 103. Jaroslav Borovička, Lars Peter Hansen. 2011. Examining Macroeconomic Models Through the Lens of Asset Pricing. SSRN Electronic Journal . [Crossref] 104. Jaroslav Borovička, Lars Peter Hansen. 2011. Examining Macroeconomic Models Through the Lens of Asset Pricing. SSRN Electronic Journal . [Crossref] 105. Luca Benzoni, Pierre Collin-Dufresne, Robert S. Goldstein, Jean Helwege. 2011. Modeling Credit Contagion Via the Updating of Fragile Beliefs. SSRN Electronic Journal . [Crossref] 106. Luca Benzoni, Pierre Collin-Dufresne, Robert S. Goldstein, Jean Helwege. 2011. Modeling Credit Contagion Via the Updating of Fragile Beliefs. SSRN Electronic Journal . [Crossref] 107. Lubos Pastor, Robert F. Stambaugh. 2011. Are Stocks Really Less Volatile in the Long Run?. SSRN Electronic Journal . [Crossref] 108. Hui Chen, Nengjiu Ju, Jianjun Miao. 2011. Dynamic Asset Allocation with Ambiguous Return Predictability. SSRN Electronic Journal . [Crossref] 109. Jessica A. Wachter. 2010. Asset Allocation. Annual Review of Financial Economics 2:1, 175-206. [Crossref] 110. Larry G. Epstein, Martin Schneider. 2010. Ambiguity and Asset Markets. Annual Review of Financial Economics 2:1, 315-346. [Crossref] 111. Martin Lettau, Sydney C. Ludvigson. Measuring and Modeling Variation in the Risk-Return Trade- off 617-690. [Crossref] 112. Massimo Guidolin, Francesca Rinaldi. 2010. Ambiguity in Asset Pricing and Portfolio Choice: A Review of the Literature. SSRN Electronic Journal . [Crossref] 113. Pierre Collin-Dufresne, Robert S. Goldstein, Jean Helwege. 2010. Is Credit Event Risk Priced? Modeling Contagion Via the Updating of Beliefs. SSRN Electronic Journal . [Crossref] 114. Harjoat Singh Bhamra, Raman Uppal. 2010. Asset Prices with Heterogeneity in Preferences and Beliefs. SSRN Electronic Journal . [Crossref] 115. Jennie Bai. 2010. Equity Premium Predictions with Adaptive Macro Indexes. SSRN Electronic Journal . [Crossref] 116. Sujoy Mukerji. 2009. FOUNDATIONS OF AMBIGUITY AND ECONOMIC MODELLING. Economics and Philosophy 25:03, 297. [Crossref] 117. Bruce N. Lehmann. 2009. The role of beliefs in inference for rational expectations models☆. Journal of Econometrics 150:2, 322-331. [Crossref] 118. Timothy Cogley, Thomas J. Sargent. 2009. Diverse Beliefs, Survival and the Market Price of Risk. The Economic Journal 119:536, 354-376. [Crossref] 119. Lars Peter Hansen, Jaroslav Borovička, Mark Hendricks, Jose A. Scheinkman. 2009. Risk Price Dynamics. SSRN Electronic Journal . [Crossref] 120. Jochen Bigus. 2009. Auditors’ Liability with Ambiguity Aversion. SSRN Electronic Journal . [Crossref] 121. Nengjiu Ju, Jianjun Miao. 2009. Ambiguity, Learning, and Asset Returns. SSRN Electronic Journal . [Crossref] 122. Daehee Jeong, Hwagyun Kim, Joon Y. Park. 2009. Does Ambiguity Matter? Estimating Asset Pricing Models with a Multiple-Priors Recursive Utility. SSRN Electronic Journal . [Crossref] 123. Salvatore Modica. 2008. Unawareness, priors and posteriors. Decisions in Economics and Finance 31:2, 81-94. [Crossref] 124. Lars Peter Hansen, John C. Heaton, Nan Li. 2008. Consumption Strikes Back? Measuring Long‐ Run Risk. Journal of Political Economy 116:2, 260-302. [Crossref] 125. Thomas J. Sargent. 2008. Evolution and Intelligent Design. American Economic Review 98:1, 5-37. [Abstract] [View PDF article] [PDF with links] 126. Lars Peter Hansen. 2008. Modeling the Long Run: Valuation in Dynamic Stochastic Economies. SSRN Electronic Journal . [Crossref] 127. Stefano d'Addona, Frode Brevik. 2007. Information Processing with Recursive Utility: Some Intriguing Results. SSRN Electronic Journal . [Crossref]