<<

Statistical Science 2016, Vol. 31, No. 4, 511–515 DOI: 10.1214/16-STS570 Main article DOI: 10.1214/16-STS592 © Institute of Mathematical Statistics, 2016 Ambiguity Aversion and Model Misspecification: An Economic Perspective Lars Peter Hansen and Massimo Marinacci

1. INTRODUCTION of a set of possible ways for a model to be misspeci- fied, we may find that little can be said of value in con- In their paper, Watson and Holmes (2016) follow fronting the decision problem. The interplay between the statistical decision approach pioneered by Wald tractability and conceptual appeal is a central consider- (1950). Under Wald’s perspective, the aim of the analy- ation when producing tools that aid in statistical deci- sis shifts from the discovery of a statistical “truth” (for sion making. By using formal to frame instance, as revealed by the correct statistical model), and analyze misspecification, the advances described to making decisions that are defensible according to by Watson and Holmes (2016) help to nurture further posited objective functions that trade off alternative connections between statistics and , and they aims. We also draw on decision theory in our discus- allow for the incorporation of additional insights from sion because it provides a formal framework for con- statistics into economic analyses. fronting . In their essay, Watson and Holmes (2016) draw con- Decades ago, Arrow (1951) distinguished two nections to some research in economics. Our comment sources of uncertainty: (i) within a model,where will describe other important advances in decision the- the uncertainty is about the outcomes that emerge in ory within the economics discipline that are designed accordance to a (probability) model that specifies fully to confront uncertainty conceived broadly to include the outcome probabilities; and (ii) ambiguity among an aversion to ambiguity and a concern about model models, where the uncertainty is about which alter- misspecification. We will also delineate some special native model, or convex combination of such models, challenges for applications in economics. should be used to assign the probabilities. If the true Statisticians and econometricians use decision the- model is not assumed to be among the original set of ory to guide how they estimate parameters and assess models under consideration, a third source of uncer- the specification of alternative models. But economics and other social sciences also use decision theory for a tainty emerges, (iii) model misspecification, where un- second purpose. People inside the models that we build certainty is induced by the approximate nature of the must speculate about the future when making invest- models under consideration to use in assigning proba- ment and other forward-looking decisions. These peo- bilities. These different sources of uncertainty are in- ple inside the models could be individuals from the pri- herent to any analysis that includes decision makers vate sector of the market economy or they could be pol- who have (probabilistic) theories about the outcomes icy makers who employ econometricians to help them and form beliefs over their relevance. in making better decisions with either self-serving or How to accommodate potential model misspecifica- social objectives. It has long been understood within tion is a challenging topic. On the one hand, if we have economics that beliefs about the future are important very precise information about the nature of the mis- inputs into model construction, and this opens the door specification, then presumably we would fix or repair to letting people inside the economic models to use the model. On the other hand, if we allow for too large data and statistical methods to help shape their beliefs. Economists debate what degree of statistical sophisti- Lars Peter Hansen is the David Rockefeller Distinguished cation we should ascribe to the people inside our mod- Service Professor in Economics, Statistics and the College, els. Thus, there is a role for depicting decision making University of Chicago, Chicago, Illinois 60637, USA under uncertainty to help capture behavior inside the (e-mail: [email protected]). Massimo Marinacci models we build. Doing so can have substantial con- holds the AXA-Bocconi Chair in Risk, Department of sequences for outcomes of the economic analysis. Sta- Decision Sciences and IGIER, Università Bocconi, 20136 tistical challenges, thus, show up in two places. They Milano, Italy (e-mail: [email protected]). are present inside the models we build to capture the

511 512 L. P. HANSEN AND M. MARINACCI behavior of astute decision makers and outside these a risk function. Consistent with this label, we view the models when econometricians adapt and apply statis- integration over x conditioned on θ in equation (1)as tical methods to models that interest them. See, for adjusting for risk. instance, Hansen (2014) for more discussion of this When it comes to applying decision theories to eco- point. nomics and other fields, the unknown parameter θ may Statistical challenges inside the models we build in- be an intermediate target. For instance, consider a de- fluence how we shape and apply decision theoretic cision maker facing uncertainty captured by a future concerns about model misspecification. Moreover, dy- payoff relevant state. Represent this state as a ran- namics are central to many economic analyses. This dom vector S with realized values s in a set S.Let interest in dynamics has had a substantial impact in the f ∗(s|a,x,θ) denote the density relative to a mea- development of decision theory within economics at sure τ ∗ over alternative s’s in S conditioned on the least since Koopmans (1960) and also within control current period action a and observed data x. Con- theory. The interplay between concerns about model sider a next period utility function U ∗ that depends on misspecification in a dynamic, stochastic environment (s, a) and integrate over s to construct U(a,x,θ) = ∗ ∗ ∗ along with the desire for tractable recursive formu- S U (s, a)f (s|a,x,θ)τ (ds). Even when not di- lations has led to some recent advances in decision rectly payoff relevant, the θ dependence of U is in- theory within economics that we will briefly describe duced by the dependence of f ∗ on θ. This formulation [see, e.g., Gilboa and Marinacci (2013)andMarinacci is amenable to dynamic programming methods where (2015) for recent overviews]. value functions are incorporated into the specification of U and U ∗,andS is a future Markov state or a shock 2. DECISION THEORY that determines this Markov state given current infor- mation. Decision theory aims to describe how a person As posed so far, this representation of decision the- should behave in an uncertain environment. We fol- ory is incomplete. Following de Finetti (1937)and low Watson and Holmes (2016) in our use of decision Savage (1954), we include a subjective prior probabil- theory, except that we extend the notation in ways that ity π, and integrate over the posited θ. With this, we will support our discussion of potential model misspec- complete the specification: ification in dynamic environments.  For a statistician, decision theory typically guides a (2) U(A|θ)π(dθ) choice of a model or an estimator of an underlying pa-  rameter vector captured by the possibly infinite dimen- and can use this integral to rank alternative action sional parameter vector θ. Econometrics often adopts rules A. Since the action rule depends on x,given this same perspective. This is captured by an unknown π we may rank alternative actions a conditioned parameter vector θ that resides in a set .Givenθ, ∗ | ∗ on x by  U(a,x,θ)π (dθ x),whereπ is the fa- a random vector X with realized values x in a set X miliar Bayesian posterior π ∗(dθ|x) ∝ f(x|θ)π(dθ). is described by a probability model f(x|θ), a density A Bayesian statistician’s job is to construct the poste- relative to a measure τ over X that provides a proba- rior π ∗. bilistic specification of X. A decision maker observes As stated, however, in this specification there is a realization x and takes an action a ∈ A that can de- no obvious scope for the expression of an aversion pend on x. Formally, an action (or decision) rule is a to model misspecification or to model ambiguity. As suitably measurable function A : X → A. noted by Watson and Holmes (2016), both de Finetti Represent the decision maker’s preferences in terms and Savage acknowledge the challenge in using sub- of a utility function U(a,x,θ). Integrate over x to con- jective probability to address such aversions. struct expected utility conditioned on θ:    3. MISSPECIFICATION (1) U(A|θ)= U A(x),x,θ f(x|θ)τ(dx), X We share the Watson and Holmes (2016) interest which will be an important ingredient of our decision of exploring the impact of model misspecification. Is- theories. The expected utility, as we have computed sues that we discuss in this section are already relevant it, conditions on the parameter θ that is typically un- for uncertainty induced by model ambiguity, but con- known to the decision maker. Statistical decision the- cerns about model misspecification magnify their im- ory often regards −U as a loss function and calls −U portance potentially by expanding substantially the set COMMENT 513 of models under consideration. A rich set of extensions of priors. This suggests a direct connection between of decision theory has emerged to confront uncertainty the more special Gilboa and Schmeidler (1989)ap- broadly conceived. These advances include some men- proach and that of Maccheroni, Marinacci and Rus- tioned by Watson and Holmes (2016), but there are tichini (2006b) in many applications, including those also others. The alternative approaches alter the inputs explored by Hansen and Sargent (2001)andHansen into a Bayesian decision problem in a variety of ways. et al. (2006) and mentioned in Watson and Holmes Some follow Wald’s (1950) approach by relying on the (2016). From the perspective of the axiomatic analysis game theoretic analysis of von Neumann and Morgen- of Maccheroni, Marinacci and Rustichini (2006b)and stern (1944) to shape an approach to uncertainty. of dynamic implementation, there are important con- Watson and Holmes (2016) point to a rich literature ceptual distinctions, however. on robust Bayesian methods that explores prior sensi- There is a seemingly different approach to ambigu- tivity in systematic ways. Motivated by applications in ity about a prior that also features ambiguity aversion. economic dynamics and control theory, decision theo- Instead of penalizing or constraining a family of priors, rists have found valuable to adopt the following start- it introduces aversion to prior ambiguity in a way that ing point. Consider a convex function C to assess a de- is conceptually similar to by including a cision maker’s response to ambiguity about the prior π. strictly increasing concave function  as in the smooth The decision maker solves the following problem: ambiguity model of Klibanoff, Marinacci and Mukerji  (2005). The decision maker then solves the following (3) max min U(A|θ)π(dθ)+ C(π). A∈A π problem:   −   The cost function imposes a penalty on the choice of (4) max  1  U(A|θ) π(dθ) . prior. Penalization methods are well known in both A∈A  statistics and control theory. The preferences implicit While this problem is not posed as one with a con- in this decision problem are what Maccheroni, Mari- cern about prior sensitivity, the informativeness or lack nacci and Rustichini (2006b) call variational prefer- thereof in the prior does play a role in the decision cri- ences. Such preferences nest the multiple priors spec- terion through curvature in the function . As noted by ification of Gilboa and Schmeidler (1989), where the Hansen and Sargent (2007), for the familiar and com- cost function C takes on the extreme form of being monly used relative entropy formulation, there is a sim- equal to infinity if π is outside a convex set of pri- ple connection between the penalization approach to ors  and zero inside. It extends the usual maximin prior sensitivity and the smooth ambiguity approach. approach to accommodate penalization, thus includ- Let πo denote a reference prior and g denote a proba- ing formulations with a reference prior and a relative bility density with respect to πo.IfG denotes the family entropy penalty as proposed by Hansen and Sargent of such densities, then (2001). It accommodates robust Bayesian analysis in  its systematic exploration of prior sensitivity with the min U(A|θ)g(θ)πo(dθ) g∈G  use of a utility or loss function. The choice to minimize  represents the aversion to ambiguity over the selection + λ g(θ)log g(θ)πo(dθ) of the prior or a concern about prior misspecification.   We may think of Problem (3) as a zero-sum game. 1 When the order of extremization can be reversed with- =−λ log exp − U(A|θ) πo(dθ). out altering the objective, then we may obtain a so-  λ called worst-case prior under which the decision maker This minimization problem is essentially a special case optimizes by taking this prior as given. Bayesians such of an optimization problem with a relative entropy as Good (1952) argue for assessing the plausibility penalty that emerges in a variety of areas of applied of this (restrained) worst-case prior. While this forges mathematics and essentially follows from Theorem 4.1 a link to Bayesian decision theory, notice that this in Watson and Holmes (2016). Thus, a particular form “choice” of prior depends on the utility or loss func- of a smooth ambiguity model emerges from a search tion rather than on a subjective introspection. over alternative prior densities subject to a penaliza- The theory of constrained optimization uses La- tion. grange or Kuhn–Tucker multipliers as a convenient We close this section by observing that decision way to enforce constraints, say on a convex family theories that extend (2) also violate the sure-thing 514 L. P. HANSEN AND M. MARINACCI principle, which has been viewed as a basic norma- models or priors over models into a potentially larger tive principle. However, the appeal of this princi- rectangular set based on considerations of dynamic ple becomes questionable under model uncertainty as consistency. Such an approach does not always yield Ellsberg (1961), pages 653–654, famously illustrated interesting answers, however. Suppose we use dynamic with the three-color paradox. versions of statistical discrepancies, including the ones used by Watson and Holmes (2016), to constrain ways 4. DYNAMICS in which models could be misspecified. The resulting rectangular embeddings are so large as to lead to de- Incorporating dynamics raises a host of interesting generate outcomes [see Hansen and Sargent (2016)for questions and has nurtured the development of recur- a formal discussion]. This phenomenon has led Hansen sive formulations of decision problems. By design, et al. (2006) to seek recursive implementations of so- these recursive formulations are amenable to applica- ex ante tion of dynamic programming methods which render called commitment problems solved from an the computation and characterization of solutions to perspective, much like what often occurs in control economics models tractable. There are a variety of con- theory as reflected say in Petersen, James and Dupuis ceptual changes once we entertain model misspecifica- (2000). The added flexibility of the dynamic version tion. of variational preferences characterized Maccheroni, Marinacci and Rustichini (2006a) gives an alternative 4.1 Misspecification in Future Dynamics attractive way to support the use of statistical discrep- Connecting the future to the past is central to ap- ancy measures as noted originally by Hansen and Sar- plication of time series statistical methods. So far, we gent (2001) and developed more fully in Hansen et al. have presumed that the density f ∗ for the future state, (2006). as well as the density f used to represent the observed 4.3 Robustness and Dynamic Learning data, depend on an underlying parameter θ.Thisgives us the opportunity to learn using Bayesian updating, In dynamic settings, yesterday’s posterior is today’s even though potential model misspecification can im- prior. Bayes’ rule is a wonderful and often conve- pede this process. But if past data are only revealing nient recursion. Even under potential misspecification, for some of the components of θ, there may remain guesses about the future are typically tied to past ev- uncertainty about how the economic environment will idence. Robust learning and its impact on decisions evolve in the future that is not fully captured by ran- provides a fascinating twist to the direct Bayesian ap- dom shocks. This type of uncertainty motivated the proach. There are a variety of approaches that have work of Hansen and Sargent (2001), Chen and Epstein been suggested, including direct application of Bayes’ (2002)andAnderson, Hansen and Sargent (2003). It rule applied to a reference prior over models with an is also a central motivation for the dynamic extension accompanying adjustment to the recursively generated of the class of variational preferences as discussed in posteriors [see, e.g., Hansen and Sargent (2007)and Maccheroni, Marinacci and Rustichini (2006a). Klibanoff, Marinacci and Mukerji (2009)]. Alterna- tively, Epstein and Schneider (2003) suggest a prior- 4.2 Dynamic Consistency by-prior application of Bayes’ rule in a dynamic mul- Dynamic applications of decision making under un- tiple priors models with a rectangular embedding to certainty, often, but not always, look for formula- enforce consistency. In a model with date zero prior tions that are dynamically consistent to avoid hav- ambiguity, Chamberlain (2000) embraces the prior-by- ing decision makers play naive or sophisticated games prior application of Bayes’ rule from an ex ante per- against future versions of themselves. This aim can spective without including a rectangular embedding. shape the formulation of decision problems. Indeed There remains scope for further dialog and discus- the penalization approach suggested by Hansen and sion of this important topic as there are potentially Sargent (2001) and the generalization developed by important tradeoffs between conceptual appeal and Maccheroni, Marinacci and Rustichini (2006a)were computational tractability. We very much hope that the motivatedinpartbysuchconcerns. essay by Watson and Holmes (2016) and our discus- Dynamic extensions of the multiple priors model of sion will encourage further synergistic research linking Gilboa and Schmeidler (1989)ledEpstein and Schnei- statistical challenges to economic model building and der (2003) to embed a subjectively specified set of analysis. COMMENT 515

ACKNOWLEDGEMENTS HANSEN,L.P.andSARGENT, T. J. (2007). Recursive robust es- timation and control without commitment. J. Econom. Theory We thank Pierpaolo Battigalli, Thomas Sargent, 136 1–27. MR2888400 Stephen Stigler and an anonymous referee for helpful HANSEN,L.P.andSARGENT, T. J. (2016). Sets of models and comments. prices of uncertainty. Working Paper No. 22000. HANSEN,L.P.,SARGENT,T.J.,TURMUHAMBETOVA,G.and WILLIAMS, N. (2006). Robust control and model misspecifica- REFERENCES tion. J. Econom. Theory 128 45–90. MR2229772 ANDERSON,E.,HANSEN,L.P.andSARGENT, T. J. (2003). KLIBANOFF,P.,MARINACCI,M.andMUKERJI, S. (2005). A quartet of semigroups for model specification, robustness, A smooth model of decision making under ambiguity. Econo- prices of risk, and model detection. J. Eur. Econ. Assoc. 1 68– metrica 73 1849–1892. MR2171327 123. KLIBANOFF,P.,MARINACCI,M.andMUKERJI, S. (2009). Re- ARROW, K. J. (1951). Alternative approaches to the theory of cursive smooth ambiguity preferences. J. Econom. Theory 144 choice in risk-taking situtations. Econometrica 19 404–437. 930–976. MR2887278 MR0046020 KOOPMANS, T. C. (1960). Stationary ordinal utility and impa- CHAMBERLAIN, G. (2000). Econometric applications of maxmin tience. Econometrica 28 287–309. MR0127422 expected utility. J. Appl. Econometrics 15 625–644. MACCHERONI,F.,MARINACCI,M.andRUSTICHINI,A. CHEN,Z.andEPSTEIN, L. (2002). Ambiguity, risk, and as- (2006a). Dynamic variational preferences. J. Econom. Theory set returns in continuous time. Econometrica 70 1403–1443. 128 4–44. MR2229771 MR1929974 MACCHERONI,F.,MARINACCI,M.andRUSTICHINI,A. DE FINETTI, B. (1937). La prévision: Ses lois logiques, ses sources (2006b). Ambiguity aversion, robustness, and the variational subjectives. Ann. Inst. Henri Poincaré 7 1–68. MR1508036 representation of preferences. Econometrica 74 1447–1498. ELLSBERG, D. (1961). Risk, ambiguity, and the Savage axioms. Q. MR2268407 J. Econ. 75 643–669. MARINACCI, M. (2015). Model uncertainty. J. Eur. Econ. Assoc. EPSTEIN,L.G.andSCHNEIDER, M. (2003). Recursive multiple- 13 998–1076. priors. J. Econom. Theory 113 1–31. MR2017864 PETERSEN,I.R.,JAMES,M.R.andDUPUIS, P. (2000). GILBOA,I.andMARINACCI, M. (2013). Ambiguity and the optimal control of stochastic uncertain systems with relative en- Bayesian Paradigm, in Advances in Economics and Economet- tropy constraints. IEEE Trans. Automat. Control 45 398–412. rics: Theory and Applications (D. ACEMOGLU,M.ARELLANO MR1762853 and E. DEKEL, eds.). Cambridge Univ. Press, Cambridge. SAVAG E , L. J. (1954). The Foundations of Statistics. Wiley, New GILBOA,I.andSCHMEIDLER, D. (1989). Maxmin expected utility with nonunique prior. J. Math. Econom. 18 141–153. York. MR0063582 MR1000102 VON NEUMANN,J.andMORGENSTERN, O. (1944). Theory of GOOD, I. J. (1952). Rational decisions. J. Roy. Statist. Soc. Ser. B. Games and Economic Behavior. Princeton Univ. Press, Prince- 14 107–114. MR0077033 ton, NJ. MR0011937 HANSEN, L. (2014). Nobel lecture: Uncertainty outside and inside WALD, A. (1950). Statistical Decision Functions. Wiley, New economic models. J. Polit. Econ. 122 945–987. York. MR0036976 HANSEN,L.P.andSARGENT, T. J. (2001). Robust control and WATSON,J.andHOLMES, C. (2016). Approximate models and model uncertainty. Am. Econ. Rev. 91 60–66. robust decisions. Statist. Sci. 31 465–589.