
Bounded Rationality and Directed Cognition

Xavier Gabaix and David Laibson

MIT and NBER∗

October 15, 2000

Preliminary and incomplete.

Abstract

This paper proposes a psychologically sensible bounded rationality model which can be operationalized. The model formalizes the well-known idea that cognitive resources are scarce and should be economized. The model makes quantitative predictions and can be realistically applied to decision problems that require an arbitrarily large number of cognitive state variables. We test the model using experimental data, and find that it significantly outperforms the rational actor model with zero cognition costs.

JEL classification: C70, C91, D80

Keywords: directed cognition, optimization, bounded rationality, heuristics, biases, experimental economics.

∗Gabaix: Department of Economics, MIT, Cambridge, MA 02142, [email protected]. Laibson: Department of Economics, Harvard University, and NBER, Cambridge MA 02138, [email protected]. We are grateful for the advice of Abhijit Banerjee, Glenn Ellison, Drew Fudenberg, Lars Hansen, Robert Lucas, Andrei Shleifer, Al Roth, Martin Weitzman, and Richard Zeckhauser. Our paper benefitted from the comments of seminar participants at the University of Chicago, Harvard University, and the University of Michigan. Numerous research assistants helped implement and analyze the experiment; we are particularly indebted to Guillermo Moloche and Stephen Weinberg. Laibson acknowledges financial support from the National Science Foundation (SBR-9510985), the National Institutes of Health (XXX), and the MacArthur Foundation.

1 Introduction

Cognitive resources should be allocated just like other scarce resources. Consider the problem of a chess player who must decide what move to make. This chess player faces a tradeoff between his desire to choose well and his desire to economize on decision time. The player should cut corners when analyzing his options, ignoring certain lines of play, even if there is a chance that he will consequently overlook some critical element of the game. Similar issues arise in almost every decision task. Where should I go to college? Which job offer should I take? Which house should I buy? In all of these problems, the decision-maker should not insist on identifying the best possible selection. Rather, he should make an educated decision and hope for the best. Often the decision-maker must make educated guesses, say when time constraints preclude a complete analysis of the available options.

All economists are familiar with the arguments of the previous paragraph. Since the work of Herbert Simon (1955), economists have recognized that cognition/attention is costly and the efficient allocation of cognition is a critical element of human decision-making. This paper moves beyond such observations by providing an operational theory of cognition, which makes sharp quantitative predictions that can be empirically tested. We provide such a test in this paper.

Our model has three elements: a set of cognitive operations, a set of state variables that encode the current state of beliefs, and a value function. Cognitive operations include the different categories of analysis that we use to deepen our understanding of a given problem. For example, consider the selection of an undergraduate college. Thinking more deeply about a particular college is a cognitive operation that will refine one's estimate of the value of attending that college. For the college choice problem, we could define a long list of cognitive operations (e.g., thinking about Boston University, thinking about Brown University, thinking about Cornell University, etc.).

Cognitive state variables encode the current state of the decision-maker's beliefs. For example, cognitive state variables will capture one's expectations about the value of each of the colleges in one's choice set. Naturally, one's expectations about any given college will be revised if one devotes incremental thought to that college. Cognitive state variables will also capture one's sense of how thoroughly each college has been considered. After dedicating hours of thought to Brown and no thought to other schools, the decision-maker will know that she has less to learn about Brown than Cornell.

The value function is used to evaluate the expected benefit of each cognitive operation. For example, should the decision-maker invest additional time thinking about Boston University or Cornell? Thinking incrementally about Cornell may substantially deepen one's knowledge about Cornell, but this information will not be useful if the decision-maker has a strong preference for an urban lifestyle. There is little use thinking about an option that is unlikely to be chosen at the end of the day. The value function enables the decision-maker to reason in such an option-theoretic way, directing thought in the most productive directions.

Our value function approach uses the standard dynamic programming toolbox, but our approach does not assume that the decision-maker knows the optimal value function. Instead, we propose a modelling strategy which uses 'proxy value functions' which provide good approximations of the optimal value function. Just like optimal value functions, these proxy value functions encode knowledge states and evaluate the usefulness of potential cognitive operations. We argue that proxy value functions are methodologically superior to optimal value functions because proxy value functions generate similar qualitative and quantitative predictions but are easy to work with. By contrast, optimal value functions cannot be calculated for realistically complex bounded rationality problems.

To model complex decisions, we propose a three-step search-theoretic decision algorithm. First, the algorithm evaluates the expected benefit of various cognitive operations. The expected benefit is related to the predicted likelihood that a cognitive operation will reveal information that will change the decision-maker's planned action. Second, the algorithm executes the cognitive operation with the greatest expected benefit. Third, the algorithm repeatedly cycles through these first two steps, stopping when the cognitive costs of analysis outweigh the expected benefits. We call this the directed cognition model. This paper shows how to practically operationalize this framework for complex decision problems with any number of state variables.

To demonstrate the operation of our approach, we analyze a one-player decision problem. We then compare the results of this analysis to the decisions of experimental subjects. In this application, we effectively use hundreds of continuous state variables. Nevertheless, such an application is easy to execute with our proxy value functions, though it is impossible to execute with optimal value functions.

We believe that our model realizes several goals. First, our model is psychologically plausible because it is based on the actual decision algorithms that subjects claim to use. Second, the model is broadly applicable, because

it can be used to analyze a wide range of decision problems including sequential games. Third, the model generates precise quantitative predictions and can be empirically tested; such tests are provided in this paper using data from an experiment on Harvard undergraduates. The data reject the rational model (i.e., zero cognition cost) in favor of our bounded rationality alternative.

Our approach combines numerous strands in the existing literature on bounded rationality. First, our conceptual approach extends the satisficing literature which was pioneered by Herbert Simon (1955) and subsequently formalized with models of deliberation costs by Conlisk (1996), Payne et al (1993) and others. Second, our approach extends the heuristics literature started by Daniel Kahneman and Amos Tversky (1974) and Richard Thaler (1991). Third, our approach is motivated by experimental work by Colin Camerer et al (1994) and theoretical work by Philippe Jehiel (1995) that argues that decision-makers may solve problems by looking forward, rather than using backward induction. Fourth, our emphasis on experimental evaluation is motivated by the work of Ido Erev and Alvin Roth (1998) and Colin Camerer and Teck-Hua Ho (1999) who develop and then econometrically test learning models.

In section 2 we describe our theoretical approach and motivate our use of proxy value functions. In section 3 we introduce a class of one-player decision trees and informally describe the way in which our model applies to this class of problems. In section 4 we provide a detailed description of this application. In section 5 we describe other algorithms that could be used to approximate human analysis of these decision trees. In section 6 we describe an experiment in which we asked undergraduates to solve these decision trees. In section 7 we compare the predictions of the models to the experimental results. In section 8 we discuss other implications and extensions of our framework. In section 9 we conclude.

2 The directed cognition approach to bounded rationality

2.1 A simple model of choice

Consumers routinely choose between multiple goods. We begin with the simplest version of such a decision problem. The consumer must choose either good B or good C. The value of the two goods is not immediately known, though careful thought will better reveal their respective values.

For example, two detailed job offers may be hard to immediately compare. It will be necessary to thoughtfully consider all of the features of the offers (e.g., health plan, pension benefits, vesting rules, maternity leave, sick days, etc.). After such consideration, the potential employee will be in a better position to compare the two options. Similar issues arise whenever a consumer chooses from a set of goods. Having all of the information is not enough. Thinking about the information is also necessary for a smart choice. We build a simple bounded rationality model that describes such thinking.

To analyze this problem, it is helpful to note that a decision between uncertain goods B and C is equivalent mathematically to a decision between a sure thing with payoff zero and good A, where A's payoff is the difference between the payoffs of goods B and C. In this section, we work with the 0 vs. A problem because it simplifies notation.

Let $a_t$ represent the agent's expectations about the payoff of good A at time t. We assume that thinking about A leads to a change in the expectations,

$$da_t = \sigma(t)\, dz. \quad (1)$$

Here dz represents an independent Brownian increment, and captures knowledge gained from introspection. Finally, we assume that thinking generates instantaneous shadow cost q. We can now formally state the agent's problem. The agent picks a time, T, at which to stop thinking such that she maximizes

$$E\left[\max\{a_T, 0\} - qT\right] \quad (2)$$

Note that the stopping time need not be fixed in advance, but can be continuously updated as the consumer learns more about the likely value of A. Finally, it is important to emphasize that the state variable in this problem is a, the agent's expectation about the state of the world. Throughout the analysis that follows, state variables will represent the state of knowledge of the decision-maker.
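To make the structure of this stopping problem concrete, the following minimal simulation (our illustration, not part of the paper) discretizes the learning process with constant σ and evaluates E[max{a_T, 0} − qT] under a simple barrier rule: stop thinking once |a_t| crosses a fixed threshold. All parameter and barrier values are arbitrary.

```python
import numpy as np

def simulated_payoff(barrier, sigma=1.0, q=0.25, dt=0.01,
                     t_max=50.0, n_sims=2000, seed=0):
    """Estimate E[max{a_T, 0} - q*T] when the agent stops thinking as
    soon as |a_t| crosses `barrier` (a hypothetical stopping rule)."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(n_sims):
        a, t = 0.0, 0.0
        while abs(a) < barrier and t < t_max:
            a += sigma * np.sqrt(dt) * rng.standard_normal()
            t += dt
        total += max(a, 0.0) - q * t
    return total / n_sims

# Net payoff is hump-shaped in the barrier: stopping too early forgoes
# option value, while thinking too long racks up cognition costs.
for b in (0.25, 1.0, 3.0):
    print(f"barrier={b:4.2f}  payoff={simulated_payoff(b):6.3f}")
```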

2.2 Rational bounded rationality: divine cognition

Suppose our decision-maker can costlessly and instantaneously think about the solution to the problem posed in equations (1) and (2). This assumption is internally inconsistent, because this costless and instantaneous thinking only applies to the solution of the maximization problem (eqs. (1)-(2)) and

not to the underlying analysis of good A, where thinking is assumed to impose shadow cost q.

Nevertheless, this internally inconsistent assumption is an interesting benchmark, and it is the only way forward for a mainstream economist whose only behavioral model is maximization. We characterize the results of this 'rational' bounded rationality approach in Appendix A, assuming either that σ(t) is constant or that σ(t) is affine and decreasing. This analysis is standard, using continuous-time Bellman equations and boundary conditions to solve for value functions and optimal policies. We refer to such rational bounded rationality as divine cognition, since the boundedly rational decision-maker (who pays cost q to think about A) miraculously and costlessly knows the optimal solution to the problem of how long to think about A. In other words, divine cognition assumes that thinking about A is costly but that thinking about thinking about A is costless.

2.3 The problem with divine cognition

The divine cognition model has two clear advantages. It makes sharp predictions and it is highly parsimonious. But it has one overwhelming disadvantage. Realistically, it can only be applied to a small set of situations. Generic generalizations of our decision problem are not analytically tractable.¹ For example, we cannot let the number of goods go from two to N. We cannot consider σ functions that are more general than the constant and affine cases. We cannot consider the empirically sensible case in which σ depends not only on total time that has transpired, but also on the specific allocation of time among different cognitive activities: $\sigma(t_f, t_g, t_h, \ldots)$. We cannot let σ vary with the particular information that was recovered during those previous searches: $\sigma(t_f, \theta_f, t_g, \theta_g, t_h, \theta_h, \ldots)$. We cannot consider the possibility that learning about good i also gives the agent some information about good j (i.e., covariances in information revelation). These are just a few of the realistic attributes of real-world problems. At the moment economic research is hamstrung by its reliance on pure maximization. Economists can't evaluate many real-world problems because we can't solve for the optimal value functions that characterize those problems.

¹A strand of the operations research literature is devoted to finding particular classes of problems whose solutions have a simple characterization. One of them uses Gittins indices (Gittins 1979), basically a means to decouple n choice problems. The Gittins solution applies only to setups with stationary environments (such as multi-arm bandit problems), and in particular does not apply (as stressed by Gittins) to finite horizon problems. See Weitzman (1979) and Roberts and Weitzman (1980) for related techniques that solve special cases of the divine cognition problem. We consider problems that do not fall within this class of special cases.

We specialize in building toy models that deliver intuition. But we might be more scientific if we could build practical models that make quantitatively meaningful predictions that can be tested.

Moreover, the constraints that economists face are not just momentary difficulties that will pass once we have a new generation of computers or better knowledge of continuous-time maximization techniques. Many real-world problems have far too many state variables to be amenable to our standard methods. Thinking about N goods with a single constant and uniform variance implies N state variables. Thinking about N goods with history-dependent variances generates 2N state variables. Thinking about N goods with history-dependent variances and covariances generates $\frac{1}{2}N(N+3)$ state variables. Thinking about N goods with M attributes with history-dependent variances and covariances generates $\frac{1}{2}MN(MN+3)$ state variables. And so on. Such multidimensional problems lie far, far beyond our current capabilities. At the moment, sophisticated numerical methods can solve generic problems with as many as ten continuous state variables.

Imagine that you are trying to model the boundedly rational thought process of a college student who is thinking about a residential choice. The student picks among 13 dormitories and 4 off-campus apartments, each of which has 8 attributes. For the last case discussed above, the divine cognition model requires that the researcher solve a dynamic programming problem with 9452 continuous state variables (here MN = 17 × 8 = 136, and $\frac{1}{2} \cdot 136 \cdot 139 = 9452$).

This example is extreme. Perhaps it is not necessary to model the covariances. Perhaps half of the attributes can be ignored because they are sufficiently minor. Perhaps all of the off-campus housing options can be ignored because few students ultimately choose them. In the stripped-down problem, the student chooses among 13 dormitories, each of which has 4 attributes. With history-dependent variances and no covariances, this problem still generates an unworkable number of state variables: 104 (2N with N = 13 × 4 = 52).

Some economists might argue that we clearly have our work cut out for us. To study bounded rationality (and the rich cognitive states that accompany such problems), we should get busy learning how to solve high-dimension dynamic programming problems. We suggest that another approach may be more fruitful. We motivate this alternative approach with two observations. First, as we have already noted, the divine cognition approach is not practically feasible. Second, even if it were feasible, it seems likely that the mind-boggling complexity of the resulting analysis should make us at least a little skeptical that it is a good representation of real-world decision-making.

2.4 Directed cognition

We suggest an alternative framework that can handle arbitrarily complex problems, but is nevertheless easy to implement. Moreover, this alternative is parsimonious and makes precise quantitative predictions. We call this approach directed cognition. Specifically, we propose the use of bounded rationality models in which decision-makers use proxy value functions that are not fully forward looking. We make this approach precise below and characterize the proxy value functions. This approach is particularly apt for bounded rationality applications, since modelling cognitive processes necessarily requires a large number of cognitive state variables. Memories and beliefs are all state variables in decisions about cognition.

Reconsider the boundedly rational decision problem described above. The consumer picks a stopping time, T, to maximize

$$E\left[\max\{a_T, 0\} - qT\right]$$

$$da_t = \sigma(t)\, dz_t$$

If the consumer had access to a proxy value function, say $\hat{V}(a, t)$, the consumer would continue thinking as long as she expected a net improvement in expected value. Formally, the agent will continue thinking if

$$\max_{\tau \in \mathbb{R}^+}\; E_t\left[\hat{V}(a_{t+\tau}, t+\tau)\right] - \hat{V}(a_t, t) - q\tau > 0. \quad (3)$$

We could assume that the agent continues thinking for duration $\tau^*$ before calculating a new $\tau^*$. Or we could assume that she continuously updates her evaluation of equation (3), stopping thinking only when $\tau^* = 0$. In this section, we make the latter assumption for tractability.

We now turn to the selection of the proxy value function. In this paper we consider the simplest possible candidate (for the choice between a and 0):

$$\hat{V}(a, t) = \max\{a, 0\}. \quad (4)$$

Equation (4) implies that the decision-maker holds partially myopic beliefs, since she fails to recognize option values beyond the current thinking horizon of $\tau^*$.

Note that this framework can be easily generalized to handle problems of arbitrary complexity. For example, if the agent were thinking about N goods, then her proxy value function would be

$$\hat{V}(a_t^1, a_t^2, \ldots, a_t^N, t) = \max\{a_t^1, a_t^2, \ldots, a_t^N\}.$$

We now consider this general N-good problem and analyze the behavior that it predicts.

Suppose that thinking for interval τ about good $A^i$ transforms one's expectation from $a_t^i$ to $a_t^i + \eta(i, \tau, t)$, where $\eta(i, \tau, t)$ is a mean zero random variable. Let $a_t^{-i} = \max_{j \ne i} a_t^j$. Then the expression

$$E_t\left[\hat{V}(a_t^1, \ldots, a_t^i + \eta(i, \tau, t), \ldots, a_t^N, t+\tau)\right] - \hat{V}(a_t^1, a_t^2, \ldots, a_t^N, t) \quad (5)$$

can be rewritten as

i i i i E max(a + η(i, τ,t),a− ) max(a ,a− ) t t − t t £ ¤ i i + i i + = E (a + η(i, τ,t) a− ) (a + η(i, τ,t) a− ) , t − − t − where x+ =max£ 0,x . Let σ(i, τ,t)¤represent the standard deviation of η(i, τ,t), and define{ u such} that η(i, τ,t)=σ(i, τ,t)u. We assume that such a random variable u exists and that it is symmetrical around 0. Then the cognition decision reduces to

$$(i^*, \tau^*) \equiv \arg\max_{i, \tau}\; w\left(a_t^i - a_t^{-i},\, \sigma(i, \tau, t)\right) - q\tau, \quad (6)$$

where

$$w(a, \sigma) = v(a, \sigma) - a^+ = v(-|a|, \sigma), \quad (7)$$
$$v(a, \sigma) = a\,\Phi\!\left(\frac{a}{\sigma}\right) + \sigma\,\phi\!\left(\frac{a}{\sigma}\right). \quad (8)$$

For the case in which thinking generates normally distributed increments to knowledge, we can write

$$w(a, \sigma) = -|a|\,\Phi\!\left(-\frac{|a|}{\sigma}\right) + \sigma\,\phi\!\left(\frac{a}{\sigma}\right). \quad (9)$$

Our boundedly rational agents won't necessarily use the "true" distribution of the information shocks u. Our agent may use simplified versions of it.

Hence, we also consider the symmetric distribution with atoms at $u = \pm 1$, each with probability $\frac{1}{2}$. For this special case, which we denote with the superscript 0, we get

$$v^0(a, \sigma) = \begin{cases} \frac{1}{2}(a + \sigma) & \text{if } |a| \le \sigma \\ a^+ & \text{if } |a| > \sigma \end{cases}$$

so that

$$w^0(a, \sigma) = \frac{1}{2}\left(\sigma - |a|\right)^+.$$

The simple, linear form above makes $w^0$ convenient to use.
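For concreteness, here is a short implementation (ours, not from the paper) of the gain functions in eqs. (7)-(9) and the two-atom variant $w^0$, using scipy's standard normal cdf and pdf:

```python
import numpy as np
from scipy.stats import norm

def v(a, sigma):
    """v(a, sigma) = E[(a + sigma*u)^+] for standard normal u, eq. (8)."""
    return a * norm.cdf(a / sigma) + sigma * norm.pdf(a / sigma)

def w(a, sigma):
    """Option value of one more thought, eq. (9): w = v(-|a|, sigma)."""
    return -abs(a) * norm.cdf(-abs(a) / sigma) + sigma * norm.pdf(a / sigma)

def w0(a, sigma):
    """Two-atom (u = +/-1) variant: w0 = 0.5 * (sigma - |a|)^+."""
    return 0.5 * max(sigma - abs(a), 0.0)

# Thinking is most valuable when the options are hard to rank (a near 0)
# and when thought is informative (sigma large).
for a in (0.0, 0.5, 2.0):
    print(f"a={a:3.1f}  w={w(a, 1.0):.4f}  w0={w0(a, 1.0):.4f}")
```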

2.5 Comparing directed cognition to divine cognition

We are now in a position to compare the predictions of the two models. Thinking for τ more minutes will reveal an amount of information of standard deviation:

$$\sigma(t, \tau) = \left(\int_t^{t+\tau} \sigma^2(s)\, ds\right)^{1/2}$$

$$w(a_t, \sigma(t, \tau)) - q\tau > 0$$

This leads to the policy "continue iff $|a_t| < \bar{a}_t$", where $\bar{a}_t$ is implicitly defined by

$$\sup_{\tau \ge 0}\; w(\bar{a}_t, \sigma(t, \tau)) - q\tau = 0 \quad (10)$$

For the standard case of the directed cognition model (i.e., normally distributed information innovations), $\bar{a}_t$ is defined by

$$\sup_{\tau \ge 0}\; -\bar{a}_t\,\Phi\!\left(\frac{-\bar{a}_t}{\sigma(t, \tau)}\right) + \sigma(t, \tau)\,\phi\!\left(\frac{\bar{a}_t}{\sigma(t, \tau)}\right) - q\tau = 0.$$

For the case with atoms of information at $\pm 1$, $\bar{a}_t$ is defined by

$$\sup_{\tau \ge 0}\; \frac{1}{2}\left(\sigma(t, \tau) - |\bar{a}_t|\right)^+ - q\tau = 0$$

So,

$$\bar{a}_t = \sup_{\tau \ge 0}\; \sigma(t, \tau) - 2q\tau.$$

We now consider the special case in which $\sigma^2(t) = \sigma^2$, so $\sigma^2(t, \tau) = \tau\sigma^2$. We call this the constant variance case. Appendix A also considers the case in which the variance falls linearly with time until it hits zero.
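In the constant variance case with the two-atom information process, the supremum above has a simple closed form; the following short derivation is ours, obtained from the first-order condition with $\sigma(t, \tau) = \sigma\sqrt{\tau}$:

```latex
\frac{d}{d\tau}\left(\sigma\sqrt{\tau} - 2q\tau\right)
   = \frac{\sigma}{2\sqrt{\tau}} - 2q = 0
   \;\Longrightarrow\;
   \tau^{\ast} = \frac{\sigma^{2}}{16\,q^{2}},
\qquad
\bar{a} = \sigma\sqrt{\tau^{\ast}} - 2q\,\tau^{\ast} = \frac{\sigma^{2}}{8\,q}.
```

Intuitively, the threshold rises with the informativeness of thought and falls with its cost.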

For the constant variance case the value function is independent of t,

$$V(a) = \begin{cases} 0 & \text{if } a \le -\bar{a} \\ \alpha\left(\frac{1}{2} - \alpha\right)\frac{\sigma^2}{q} + \frac{a}{2} + \frac{q\,a^2}{\sigma^2} & \text{if } |a| < \bar{a} \\ a & \text{if } a \ge \bar{a} \end{cases}$$

where $\bar{a} = \alpha\sigma^2/q$, and smooth pasting at $\pm\bar{a}$ pins down $\alpha = 1/4$.

[Figure 1: Value functions.]

We can also calculate the expected additional thinking time until stopping,

$$T(a) = E[T \mid a] = \begin{cases} 0 & \text{if } |a| \ge \bar{a} \\ -a^2/\sigma^2 + \alpha^2\sigma^2/q^2 & \text{if } |a| < \bar{a} \end{cases} \quad (11)$$

This derivation is reported in Appendix A.

Finally, we can also calculate the expected consumption utility,

$$C(a) = E[a_T^+ \mid a] = \begin{cases} 0 & \text{if } a \le -\bar{a} \\ \frac{1}{2}\left(\frac{\alpha\sigma^2}{q} + a\right) & \text{if } |a| < \bar{a} \\ a & \text{if } a \ge \bar{a} \end{cases}$$
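As a consistency check (ours, not in the original text), these three objects satisfy the accounting identity "value equals expected consumption utility minus expected thinking costs" on the continuation region:

```latex
C(a) - q\,T(a)
  = \frac{a}{2} + \frac{\alpha\sigma^{2}}{2q}
    - q\left(\frac{\alpha^{2}\sigma^{2}}{q^{2}} - \frac{a^{2}}{\sigma^{2}}\right)
  = \alpha\left(\tfrac{1}{2}-\alpha\right)\frac{\sigma^{2}}{q}
    + \frac{a}{2} + \frac{q\,a^{2}}{\sigma^{2}}
  = V(a), \qquad |a| < \bar{a}.
```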

Proposition 1 Let V(x) represent the value function under the optimal policy (divine cognition). Suppose the variance function $\sigma(\cdot) = (\sigma(t))_{t \ge 0}$ is deterministic and weakly decreasing.² Then,

$$S^0\!\left(x, \sqrt{2/\pi}\,\sigma(\cdot)\right) \le S^1(x, \sigma(\cdot)) \le V^1(x, \sigma(\cdot)) \le V(x, \sigma(\cdot)) \le S^2(x, \sigma(\cdot)) \le S^0(x, \sigma(\cdot)) + \min\left(|x|,\, S^0(0, \sigma(\cdot))\right)/2 \quad (12)$$

where $S^i$ is the subjective value function,

$$S^i(x, \sigma(\cdot)) = \sup_{t \in \mathbb{R}^+}\; w^i\!\left(x, \left(\int_0^t \sigma^2(s)\, ds\right)^{1/2}\right) - qt,$$

$w^0$ and $w^1$ are as before in the text,

$$w^2(x, s) = \left(\sqrt{x^2 + s^2} - |x|\right)/2,$$

and $V^1$ is the objective value function induced by directed cognition with normal innovations.

Remark 2 One can always write $w^i(x, s) = E\left[(x + su)^+\right] - x^+ = \int_{-\infty}^{\infty} (-|x| + su)^+ f^i(u)\, du$ for some density $f^i$. Recall,

$$f^0(u) = \left(\delta(u+1) + \delta(u-1)\right)/2 \quad \text{(i.e., } u = \pm 1 \text{ with equal probability)}$$
$$f^1(u) = \frac{1}{\sqrt{2\pi}}\, e^{-u^2/2} \quad \text{(i.e., } u \text{ is normally distributed)}$$

²The nonincreasing σ implies that "high payoff" cognitive operations are done first. Hence, thinking displays decreasing returns over time.

For the new case, i = 2, we have the density

$$f^2(u) = \frac{1}{2(1 + u^2)^{3/2}},$$

which has finite mean but infinite variance.

Proposition 1 is proved in Appendix B.

The Proposition is stated in terms of "subjective value functions", which bound the objective value functions. Such subjective value functions represent the expected utility outcomes under the mistaken assumption that the current cognitive operation (with incremental stopping time τ) is the final cognitive operation. The subjective value functions are therefore based on static beliefs, which presume that there will be no future cognitive operations. Such subjective value functions are relatively easy to calculate because they assume deterministic stopping times. By contrast, optimal value functions incorporate optimally chosen stopping times that depend on future realizations of random variables.

Proposition 1 implies that the objective value functions for divine cognition (V) and directed cognition ($V^1$) are bounded below by $S^1$ and above by $S^2$. $S^1$ represents the subjective value function associated with normal innovations. $S^2$ represents the subjective value function associated with the $f^2(u)$ density.

It is useful to evaluate the bounds at the point x = 0, as this is where the option value $V(x) - x^+$ is greatest.³ $V(0, \sigma(\cdot))$ gives the order of magnitude of the option value. With $\sigma(0, t) := \left(\int_0^t \sigma^2(s)\, ds\right)^{1/2}$, the $S^0$ bounds of Proposition 1 imply

$$S^0\!\left(0, \sqrt{2/\pi}\,\sigma(\cdot)\right) \le V^1(0, \sigma(\cdot)) \le V(0, \sigma(\cdot)) \le S^0(0, \sigma(\cdot)), \quad (13)$$

or,

$$\sup_{t \in \mathbb{R}^+} \frac{1}{\sqrt{2\pi}}\,\sigma(0, t) - qt \;\le\; V^1(0, \sigma(\cdot)) \;\le\; V(0, \sigma(\cdot)) \;\le\; \sup_{t \in \mathbb{R}^+} \frac{1}{2}\,\sigma(0, t) - qt.$$

The two objective value functions, $V^1$ and V, are bounded below by the subjective value function associated with the $w^0$-strategy with σ multiplied by factor $\sqrt{2/\pi} \simeq .80$. $V^1$ and V are bounded above by the subjective value function associated with the $w^0$-strategy.

³This can easily be proved. [...]

So the difference between $V^1$ and V is bounded by the difference between subjective value functions generated by standard deviations $\sqrt{2/\pi}\,\sigma \approx .8\sigma$ and σ. In this sense, the two value functions are relatively close.

Proposition 1 has a corollary that reinforces these quantitative interpretations.

Corollary 3

$$V\!\left(x, \sqrt{2/\pi}\,\sigma(\cdot)\right) - \min\!\left(|x|,\, S^0\!\left(0, \sqrt{2/\pi}\,\sigma(\cdot)\right)\right)/2 \;\le\; S^0\!\left(x, \sqrt{2/\pi}\,\sigma(\cdot)\right) \quad (14)$$

Evaluating the corollary at x = 0 implies,

$$V\!\left(0, \sqrt{2/\pi}\,\sigma(\cdot)\right) \le V^1(0, \sigma(\cdot)) \le V(0, \sigma(\cdot)) \quad (15)$$

So the difference between $V^1$ and V is bounded by the difference between objective value functions generated by standard deviations $\sqrt{2/\pi}\,\sigma \approx .8\sigma$ and σ. Note that the factor $\sqrt{2/\pi}$ is the same whether we use the $S^0$ bounds (above) or the V bounds.

Comparative static analysis also implies that our directed cognition model is "close" to the divine cognition model. The subjective and objective value functions are all increasing in σ(·) and decreasing in q. They also share other properties. All of the value functions tend to $x^+$ as $|x| \to \infty$; rise with x; have slope 1/2 at 0; and are symmetric around 0 (e.g., $V(x) - x^+$ is even).

This section has described our directed cognition model. We were able to compare it to the divine cognition model for a special case in which the decision-maker chooses between A and 0.⁴ For this extremely simple example, the divine cognition approach is well-defined and solvable in practice. In most problems of interest, the divine cognition model will not be of any use, since economists can't solve dynamic programming problems with a sufficiently large number of state variables. Below, we turn to an example of such a problem. Specifically, this problem involves the analysis of 10×10 (or 10×5) decision trees. Compared to real-world decision problems, such trees are not particularly complex. However, even simple decision trees like these generate hundreds of cognitive state variables. Divine cognition is useless here. But directed cognition, with its proxy value function, can be applied without difficulty. We turn now to this application.

⁴All of the bounding results in this subsection assume that the decision-maker selects among two options (n = 2). We can show that the lower bounds that we derive generalize to the case n ≥ 3. Obtaining analogous upper bounds for the general case remains an open question.

3 Decision trees and an informal algorithm

We use the directed cognition model to predict how experimental subjects will make choices in decision trees. We analyze decision trees, since trees can be used to represent a wide class of problems. We describe our decision trees at the beginning of this section, and then apply the model to the analysis of these trees. At the end of the paper, we discuss how the model can be generalized to analyze sequential games of complete information.

Consider the decision tree in Figure 2, which is one of twelve randomly generated trees that we asked experimental subjects to analyze. Each starting box in the left-hand column leads probabilistically to boxes in the second column. Branches connect boxes and each branch is associated with a given probability. For example, the first box in the last row contains a 5. From this box there are two branches with respective probabilities .65 and .35. The numbers inside the boxes represent flow payoffs. Starting from the last row, there exist 7,776 outcome paths. For example, the outcome path that follows the highest probability branch at each node is ⟨5, 4, 4, 3, 1, −2, 5, 5, −3, 4⟩. Integrating with appropriate probability weights over all 7,776 paths, the expected payoff of starting from this row is 4.12.

We asked undergraduate subjects to choose one of the boxes in the first column of Figure 2. We told the subjects that they would be paid the expected value of the starting box that they chose. We informally describe the application of the directed cognition model to this problem in this subsection, before providing a mathematically precise description in the next subsection.

Our application is built on a single basic cognitive operation: extending a partially examined path one column deeper into the tree. Such extensions enable the decision-maker to improve her forecast of the expected value of any given starting row. For example, consider again the last row of Figure 2. Imagine that a decision-maker is trying to approximate the expected value of that starting row, and that he has calculated only the expected value of the two truncated paths leading from column 1 to column 2. Hence, the estimated value would be: $a^{10} = 5 + (.65)(4) + (.35)(-5)$. Following the highest probability path (i.e., the upper path with probability .65), the decision-maker could look ahead one additional column and come up with a revised estimated value:

$$a^{10\prime} = a^{10} + .65\left[.15(-3) + .5(4) + .2(-5) + .15(-4)\right].$$

Such path extensions are the basic cognitive operation that we consider. Note that path extensions replace the reduced form cognitive operators in the model discussed in section 2. In that previous analysis we did not specify the nature of the cognitive operators, assuming only that the operators had shadow cost q and revealed information about the available choices. In the current application, the cognitive operators reveal structural information about the decision problem. Path extensions refine the decision-maker's expectations about the value of a starting row.

We will also allow concatenated path extensions. Specifically, if the decision-maker executes a two-step path extension from a particular node, he twice follows the branch with the highest probability. Hence, starting from the last row of Figure 2, his updated estimate would be

$$a^{10\prime\prime} = a^{10\prime} + (.65)(.5)\left[.3(0) + .05(3) + .65(3)\right].$$

Such concatenated path extensions generalize naturally. A τ-step path extension looks τ columns more deeply into the tree. At each intermediate node the extension follows the branch with highest probability.

A given path from a particular starting node can be extended in many different ways, since a path will generally have many subpaths. Let f represent one such path extension; f embeds information about the starting row, the subpath which is to be extended, and the number of steps in that extension. Let $\sigma_f$ represent the standard deviation of the updated estimate resulting from application of f. Hence,

$$\sigma_f^2 = E\left(f(a^{i(f)}) - a^{i(f)}\right)^2,$$

where i(f) represents the starting row which begins the path that f extends, $a^{i(f)}$ represents the initial estimate of the value of row i(f), and $f(a^{i(f)})$ represents the updated value resulting from the path extension.

We now apply the model of section 2. The decision-maker selects the path extension with the greatest expected benefit. Hence, the decision-maker picks the cognitive operation, f, which is maximally useful.

$$f^* \equiv \arg\max_f\; w\!\left(a^{i(f)} - a^{-i(f)},\, \sigma_f\right) - q_f. \quad (16)$$

Note that $q_f$ is the cognitive cost associated with cognitive operator f. Hence, if f is a τ-step concatenated operation, then $q_f = \tau q$. The decision-maker keeps identifying and executing such cognitive operations until he concludes his search. Specifically, we adopt the following three-step procedure.

Step 1 Evaluate the expected gain of the available cognitive operations. The expected gain of a cognitive operation is the option value of changing one's consumption choices as a result of insights generated by the cognitive operation. The expected gain of the best cognitive operation is given by:

$$G := \max_f\; w\!\left(a^{i(f)} - a^{-i(f)},\, \sigma_f\right) - q_f \quad (17)$$

For the decision tree application, the cognitive operations are path extensions.

Step 2 Choose and execute the best cognitive operation, i.e., $f^*$ s.t.

$$f^* = \arg\max_f\; w\!\left(a^{i(f)} - a^{-i(f)},\, \sigma_f\right) - q_f$$

Apply $f^*$, yielding updated information about the problem. For the decision tree application, the path extension with the highest expected gain is executed.

Step 3 Keep cycling back to Step 1, as long as the predicted marginal cost of another cycle of cognitive operations is less than the predicted marginal benefit, i.e., as long as $G \ge Fq$. When cognition costs exceed cognition benefits, stop thinking and choose the consumption option which is currently best given what is already known about the problem: choose the starting row with the maximal expected value. Here Fq is the fixed cost of solving the problem in equation (17). A runnable sketch of this loop appears below.
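The sketch below is ours. To keep it self-contained and runnable we instantiate the three steps on the N-good problem of section 2.4, where a cognitive operation is "think about good i for τ periods" and executing it perturbs the estimate $a^i$ by a mean-zero normal innovation; in the tree application the operations would instead be path extensions. All parameter values are illustrative.

```python
import numpy as np
from scipy.stats import norm

def w(a, sigma):
    """Option value of one more thought (normal innovations, eq. (9))."""
    return -abs(a) * norm.cdf(-abs(a) / sigma) + sigma * norm.pdf(a / sigma)

def directed_cognition(a, sigma=1.0, q=0.02, F=1.0, taus=(1, 2, 4, 8), seed=0):
    """Three-step loop: score operations (Step 1), run the best one
    (Step 2), repeat until gains fall below the fixed cost F*q (Step 3)."""
    rng = np.random.default_rng(seed)
    a = np.asarray(a, dtype=float).copy()
    while True:
        # Step 1: expected gain of thinking about good i for tau periods.
        best_gain, best_op = -np.inf, None
        for i in range(len(a)):
            rival = np.max(np.delete(a, i))       # best alternative a^{-i}
            for tau in taus:
                gain = w(a[i] - rival, sigma * np.sqrt(tau)) - q * tau
                if gain > best_gain:
                    best_gain, best_op = gain, (i, tau)
        # Step 3 test: stop when the best gain no longer covers the
        # fixed cost of another round of deliberation.
        if best_gain < F * q:
            break
        # Step 2: execute the chosen operation, updating beliefs.
        i, tau = best_op
        a[i] += sigma * np.sqrt(tau) * rng.standard_normal()
    return int(np.argmax(a))   # choose the option with the best estimate

print(directed_cognition([0.1, 0.0, -0.2]))
```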

Before turning to the formal presentation of the application, it is helpful to discuss several observations about the model.

First, Step 1 is computationally complex. We view Step 1 as an approximation of the less technical intuitive insights that decision-makers use to identify cognitive operations that are likely to be useful. Even if Step 1 is intuition-based, these intuitions generate some cost. We assume that each pass through Step 1 has shadow cost Fq.

Second, if decision-makers use their intuition to identify cognitive operations with high expected gains, they may sometimes make mistakes and fail to pick the operation with the highest expected gain. Hence, it may be appropriate to use a probabilistic logit model to predict which cognitive operation the decision-maker will execute in Step 2. However, we overlook this issue in the current paper and assume that the decision-maker always selects the cognitive operation with the highest expected gain.

Third, in the decision tree application we use rational expectations to calculate σ, but many other approaches are sensible. For example, $\sigma^2$ could be estimated using past observations of $(f(a^{i(f)}) - a^{i(f)})^2$.

Fourth, we assume that Step 3 is backward looking. Specifically, the decision-maker uses past information about the expected gains of cognitive operators to determine whether it is worth paying cost Fq to generate a new set of intuitions. No doubt this treatment could be improved in several ways. But, consistent with our modeling strategy, we chose the simplest approach that was broadly in accord with the psychology and the "cognitive economics" of the task.

4 A formal statement of the model

This section formally describes our applied model. Because of the notational complexity, readers who are satisfied with the informal description of the directed cognition model may wish to jump ahead to section 5.

4.1 Notation for Decision Trees

The following notation is used to formally define the algorithm. A node n is a box in the game, and $n'$ represents a node in the next column. Let $p(n'|n)$ represent the probability of the branch leading from n to $n'$.⁵ The flow payoff of node n is $u_n$. For the games that we study, the $u_n$ have mean 0 and standard deviation $\sigma_u$.

⁵If no such branch exists, $p(n'|n) = 0$.

A path θ is a set of connected nodes $(n_1, \ldots, n_m)$, with node $n_i$ in column i. The starting node is $b(\theta) = n_1$. Let $\pi(\theta)$ represent the probability (along the path) of the last node of the path, i.e., $\pi(\theta) = \prod_{i=1}^{m-1} p(n_{i+1}|n_i)$.⁶ Let $l(\theta) = n_m$ represent the last node in path θ. C is the total number of columns in a given decision tree and $\mathrm{col}(\theta) = m$ is the column of the last box in θ. Call $\theta_{[1,j]} = (n_1, \ldots, n_j)$ the truncation of θ to its j first nodes, and $\theta_{-1} = (n_1, \ldots, n_{m-1})$ its truncation by the last node. If $\theta = (n_1, \ldots, n_m)$ is a path, $e(\theta)$ is the path constructed by adding to the last node of θ the node with the highest probability branch:⁷ $e(\theta) = (n_1, \ldots, n_{m+1})$, where

$$p(n_{m+1}|n_m) = \sup\left\{p(n|n_m) \;:\; n \text{ branching from } n_m\right\}.$$

If Θ is a set of examined paths, e(Θ) is the set of its extensions: i.e., a path $\theta = (n_1, \ldots, n_{m+1}) \in e(\Theta)$ iff $(n_1, \ldots, n_m) = \theta'_{[1,m]}$ for some $\theta' \in \Theta$ (i.e., θ is the extension of a segment of a path already in Θ), and $(n_m, n_{m+1})$ has the highest possible probability among branches starting from $n_m$ that are not already traced in Θ:

$$p(n_{m+1}|n_m) = \sup\left\{p(n|n_m) \;:\; \text{there is no } \theta' \in \Theta \text{ with } \theta'_{[1,m+1]} = (n_1, \ldots, n_m, n)\right\}$$

Note that extensions can branch from the "middle" of a pre-existing path.

A time index t represents the accumulated number of incremental extensions which have been implemented by the algorithm. Each tick in t represents an additional one-node extension of a path. θ(t) is the path which was extended/examined during period t. Θ(t) is the current set of all paths that have been extended/examined (including θ(t)). At time t, $\mathcal{E}(b, t)$ is the estimate of the value of starting node b. At time t, the "best alternative" to starting node b is starting node a(b, t). If b is the best node, then a(b, t) is the second-best node. If b is not the best node, then a(b, t) is the best node.

4.2 The directed cognition algorithm

The algorithm uses two cost variables. Recall that Step 1 of the algorithm generates intuitions about the relative merits of different cognitive operations. Building these intuitions costs qF. Step 2 executes those cognitive operations. The cost of implementing a single cognitive operation is q, and our algorithm allows the decision-maker to implement strings of such operations. Hence, Step 2 costs will always be multiples of q: $\tau q$ where $\tau \in \{0, 1, \ldots, C-2\}$.

⁶If m = 1, $\pi(\theta) = 1$.
⁷In case of ties, take the top branch. Note that $e(\theta)$ is not defined if θ already has reached the last column, and cannot be extended.

To start the algorithm, set t = 0. Let Θ(0) contain the set of paths made of starting boxes, with

$$\mathcal{E}(b, 0) = u_n + \sum_{n' \text{ branching from } n} p(n'|n)\, u_{n'},$$

where n represents node (b, 1). Hence, $\mathcal{E}(b, 0)$ represents the flow payoff of starting box b plus the probability-weighted payoffs of the boxes that branch from starting box b. Now iterate through three steps, beginning with Step 1.

4.2.1 Step 1

For $\theta \in e(\Theta(t))$, let $[\theta^*, \tau^*] =$

$$\arg\max_{\substack{\theta \in e(\Theta(t)) \\ 0 \le \tau \le C - \mathrm{col}(\theta)}} w\Big(\mathcal{E}(b(\theta), t) - \mathcal{E}(a(b(\theta), t), t),\; \pi(\theta)\,\sigma(\tau)\Big) - q\tau \quad (18)$$

The interpretation of the maximum is how much gain we can expect to get by exploring path $\theta^*$, $\tau^*$ steps ahead. Recall, $b(\theta)$ is the starting box of θ, and a(b, t) is the best alternative to b.⁸ In the informal description of the algorithm, we assumed that the new information has standard deviation σ. In the equation above, the standard deviation is $\pi(\theta)\sigma(\tau)$. The function σ(τ) is the rational expectations standard deviation generated by a concatenated τ-column extension of any path, given that the probability of reaching the beginning of the extended portion of the path is unity. The factor π(θ) adjusts for the probability of actually reaching the extended portion of the path. Recall that π(θ) is the probability of reaching the last node of path θ, i.e., $\pi(\theta) = \prod_{i=1}^{m-1} p(n_{i+1}|n_i)$. The σ(τ) function is derived below, after Step 3.

Call G the value of the maximum in (18). Go to Step 2.

⁸We adopt the convention that $\max S = -\infty$ if S is empty.

4.2.2 Step 2

In this step we execute the (concatenated) cognitive operation with the highest expected gain. We "explore path $\theta^*$ at depth $\tau^*$." In detail:

i. If $\tau^* = 0$, go directly to Step 3.

ii. Let $\theta(t+1) = \theta^*$. In the set of examined paths, add θ(t+1), and, if it is the extension of a path $\theta'$ at its terminal node, remove that path:

$$\Theta(t+1) := \begin{cases} \Theta(t) \cup \{\theta(t+1)\} \setminus \{\theta_{-1}(t+1)\} & \text{if } \theta_{-1}(t+1) \in \Theta(t) \\ \Theta(t) \cup \{\theta(t+1)\} & \text{otherwise} \end{cases}$$

iii. Update the value of the starting box b(θ(t+1)). If $b \ne b(\theta(t+1))$, then set $\mathcal{E}(b, t+1) = \mathcal{E}(b, t)$. If $b = b(\theta(t+1))$, then set $\mathcal{E}(b, t+1) =$

$$\mathcal{E}(b, t) + \pi(\theta(t+1)) \sum_{n' \text{ branching from } n} p(n'|n)\, u_{n'}, \quad (19)$$

where $n = l(\theta(t+1))$.

iv. Set $\tau^* = \tau^* - 1$. Set $t = t + 1$.

v. If $\tau^* \ge 1$, set $\theta(t+1) = e(\theta(t))$ and set $\theta^* = e(\theta(t))$. Then go to substep ii above.

vi. If $\tau^* = 0$, go to Step 3.

4.2.3 Step 3

i. If $G \ge qF$, go to Step 1.

ii. If $G < qF$, stop thinking and choose the starting box with the highest current estimate $\mathcal{E}(b, t)$.

This ends the algorithm, but we have one remaining loose end. We still need to derive the standard deviation function σ(τ).

4.3 The value of σ(τ)

For a generic draw of probabilities at a node $n'$, we have $r(n')$ branches and probabilities that we can order: $p_1(n') \ge \ldots \ge p_{r(n')}(n')$. They are random variables. Recall that $\sigma_u$ represents the standard deviation of the payoff in a box. Exploring one column ahead will give an expected square value:

$$\sigma^2(1) = \sigma_u^2\, E\left[p_1^2 + \ldots + p_r^2\right] \quad (20)$$

Exploring two steps ahead, following the highest probability path between columns, gives:

$$\sigma^2(2) = E\left[(p_1^2 + \ldots + p_r^2)\,\sigma_u^2 + p_1^2 \cdot \sigma^2(1)\right] = (1 + E[p_1^2])\,\sigma^2(1)$$

In general:

$$\sigma^2(\tau) = \left(1 + E[p_1^2] + \ldots + E[p_1^2]^{\tau-1}\right)\sigma^2(1) = \frac{1 - E[p_1^2]^\tau}{1 - E[p_1^2]}\,\sigma^2(1)$$

Consistently, we set $\sigma^2(0) = 0$.
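In code, this recursion is a one-liner; the sketch below (ours) uses illustrative moment values rather than the paper's estimated branching distributions.

```python
def sigma_tau(tau, sigma1_sq, Ep1_sq):
    """Std deviation sigma(tau) from the geometric-sum formula above,
    with sigma^2(0) = 0. Requires Ep1_sq < 1."""
    if tau == 0:
        return 0.0
    return ((1.0 - Ep1_sq**tau) / (1.0 - Ep1_sq) * sigma1_sq) ** 0.5

# Illustrative moments: with roughly three branches per box, the sum of
# squared branch probabilities might average ~ 0.4 and E[p1^2] ~ 0.3
# (made-up numbers, not the paper's estimates).
sigma_u = 3.4
sigma1_sq = sigma_u**2 * 0.4                     # eq. (20)
print([round(sigma_tau(t, sigma1_sq, 0.3), 2) for t in range(5)])
```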

5 Other decision algorithms

In the next section, we empirically compare the directed cognition model to the rational actor model and three other choice models, which we call the column cutoff model, the discounting model, and the FTL model.

The rational actor model with zero cognition cost is the standard assumption in almost all economic models. This corresponds to the limit case q = 0. For presentational simplicity, we refer to this benchmark as the rational actor model, but we emphasize that this standard model assumes both rationality and zero cognition costs.

The column cutoff model assumes that decision-makers calculate perfectly rationally but pay attention to only the first Q columns of the C-column tree, completely ignoring the remaining C − Q columns. The discounting model assumes that decision-makers follow all paths, but exponentially discount payoffs according to the column in which those payoffs arise. Both of these algorithms are designed to capture the idea that decision-makers ignore information that is relatively less likely to be useful.

For example, the discounting model can be interpreted as a model in which the decision-maker has a reduced probability of seeing payoffs in "later" columns.

We also consider a bounded rationality model that we called "follow the leaders" (FTL) in previous work (Gabaix and Laibson, 2000). This algorithm features a cutoff probability p.

From any starting box b follow all branches that have a probability greater than or equal to p. Continue in this way, moving from left to right across the columns in the decision tree. If a branch has a probability less than p, consider the box to which the branch leads but do not advance beyond that box. Weight all boxes that you consider with their associated cumulative probabilities, and calculate the weighted sum of the boxes.

This heuristic describes the FTL value of a starting box b. In our econometric analysis, we estimate p = .30; perfect rationality with zero cognition cost corresponds to p = 0.

More precisely,⁹ we take the average payoffs over trajectories starting from box b, with the transition probabilities indicated on the branches, but with the additional rule that the trajectory stops if the last transition probability has been less than p:

$$\mathcal{E}^{FTL(p)}(b) = \sum_{i=1}^{C} \;\sum_{n_i \text{ in column } i} \;\sum_{\substack{(n_j)_{j=2,\ldots,i-1} \\ n_j \text{ in column } j}} u_{n_i}\, p(n_i|n_{i-1}) \cdots p(n_2|b)\; \mathbf{1}_{p(n_{i-1}|n_{i-2}) \ge p} \cdots \mathbf{1}_{p(n_2|b) \ge p} \quad (21)$$

Recall that C is the number of columns in the decision tree. For example, consider the following FTL (p = .60) calculation for the second row in Figure 2. FTL follows all paths that originate in that row, stopping only when the last branch of a path has a probability less than p.

$$\mathcal{E}^{FTL(p)}(b_2) = 0 + 2 + .15 \cdot (-2) + .01 \cdot (-4) + .14 \cdot (-4) + .7\,(5 + .5 \cdot 1 + .5 \cdot 1) = 5.3.$$

This FTL value can be compared to the rational value of 5.78.

We consider FTL because it corresponds to some of our intuitions about how decision-makers "see" decision trees that do not contain large outlier payoffs.

⁹See Gabaix and Laibson (2000) for an alternative but equivalent description of FTL.
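A compact recursive implementation of the FTL heuristic (our sketch) makes the stopping rule explicit. The tree fragment below is hypothetical, not one of the paper's twelve trees: `tree` maps each box to its (branch probability, child) pairs and `payoff` holds the flow payoffs.

```python
def ftl_value(node, p, tree, payoff, prob=1.0, follow=True):
    """FTL evaluation of a starting box: every box reached is weighted
    by its cumulative probability, but we only advance past a box if
    the branch that led to it had probability >= p."""
    total = prob * payoff[node]
    if follow:
        for q_branch, child in tree.get(node, []):
            total += ftl_value(child, p, tree, payoff,
                               prob * q_branch, follow=(q_branch >= p))
    return total

# Hypothetical two-column fragment (illustrative numbers only).
payoff = {"b": 0, "x": 2, "y": -2, "z": 5, "x1": 1, "x2": 1}
tree = {"b": [(0.7, "x"), (0.15, "y"), (0.15, "z")],
        "x": [(0.5, "x1"), (0.5, "x2")]}
print(ftl_value("b", 0.60, tree, payoff))   # low-probability boxes are
                                            # seen but never extended
```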

Subjects follow paths that have high probability branches at every node. In the process of determining which branches to follow, subjects see low probability branches that branch off of high probability paths. Subjects encode such low probability branches, but subjects do not follow them deeper into the tree.

All three of the quasi-rational algorithms described in this section (column cutoff, column discounting, and FTL) are likely to do a poor job when they are applied to a wider class of decision problems. None of these algorithms are based on a structural microfoundation (like cognition costs). The parameters in column cutoff, column discounting, and FTL are unlikely to generalize to other games. In addition, we do not believe that column cutoff, column discounting, and FTL are as psychologically plausible as the directed cognition model. Only the directed cognition model features the property of "thinking about when to stop thinking," an important component of any decision task. For all of these reasons we do not propose column cutoff, column discounting, and FTL as general models of bounded rationality. However, they are interesting benchmarks with which to compare the directed cognition model.

Finally, we need to close all of our models by providing a theory that translates payoff evaluations into choices. Since subjects rarely agree on the "best choice," it would be foolish to assume that subjects all choose the starting row with the highest evaluation (whether that evaluation is rational or quasi-rational). Instead, we assume that actions with the highest evaluations will be chosen with the greatest frequency. Specifically, assume that there are B possible choices with evaluations $\{\mathcal{E}_1, \mathcal{E}_2, \ldots, \mathcal{E}_B\}$ given by one of the candidate models. Then the probability of picking box b is given by:

$$P_b = \frac{\exp(\beta\, \mathcal{E}_b)}{\sum_{b'=1}^{B} \exp(\beta\, \mathcal{E}_{b'})}. \quad (22)$$

We estimate the nuisance parameter β in our econometric analysis.
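Equation (22) is a standard logit rule; a short implementation (ours, with illustrative evaluations) is:

```python
import numpy as np

def choice_probabilities(evaluations, beta):
    """Logit choice rule of eq. (22): higher-evaluation boxes are chosen
    more often; beta indexes how sharply choices concentrate."""
    z = beta * np.asarray(evaluations, dtype=float)
    z -= z.max()                      # guard against overflow
    expz = np.exp(z)
    return expz / expz.sum()

# beta -> 0 gives uniform choice; large beta approaches the argmax.
print(choice_probabilities([4.1, 5.8, -1.0], beta=1.0).round(3))
```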

6 Experimental Design

We tested the model using experimental data from 259 subjects recruited from the general Harvard undergraduate population. Subjects were guaranteed a minimum payment of $7 and could receive more if they performed well. They received an average payment of $20.08.

6.1 Decision Trees

The experiment was based around twelve randomly generated trees, one of which is reproduced in Figure 2. We chose large trees, because we wanted undergraduates to use heuristics to "solve" these problems. Half of the trees have 10 rows and 5 columns of payoff boxes; the other half of the trees are 10×10. The branching structure which exits from each box was chosen randomly and is iid across boxes within each tree. The distributions that we used and the twelve resulting trees themselves are presented in an accompanying working paper. On average the twelve trees have approximately three exit branches per box (excluding the boxes in the last column of each tree, which have no exit branches). The payoffs in the trees were also chosen randomly and are iid across boxes.¹⁰ The distribution is given below:

$$\text{individual box payoff} = \begin{cases} \text{uniform}[-5, 5] & \text{with probability } .9 \\ \text{uniform}[-5, 5] + \text{uniform}[-10, 10] & \text{with probability } .1 \end{cases}$$
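For concreteness, the mixture can be sampled as follows (our sketch; the second component fattens the tails, so roughly 10% of boxes are outliers):

```python
import numpy as np

def draw_box_payoff(rng):
    """One payoff from the mixture above: uniform[-5,5], plus an extra
    uniform[-10,10] shock with probability .1; rounded as in the trees."""
    x = rng.uniform(-5, 5)
    if rng.random() < 0.1:
        x += rng.uniform(-10, 10)
    return round(x)

rng = np.random.default_rng(0)
draws = [draw_box_payoff(rng) for _ in range(10_000)]
print(np.mean(draws), np.std(draws))   # mean near 0, std near 3.4
```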

Payoffs were rounded to whole numbers and these rounded numbers were used in the payoff boxes.

Experimental instructions (see appendix 1) described the concept of expected value to the subjects. Subjects were told to choose one starting row from each of the twelve trees. Subjects were told that one of the trees would be randomly selected and that they would be paid the true expected value for the starting box that they chose in that tree. Subjects were given a maximal time of 40 minutes to read the experimental instructions and make all twelve of their choices. We analyzed data only from the 251 subjects who made a choice in all of the twelve decision trees.

6.2 Debriefing form and expected value calculations

After passing in their completed tree forms, subjects filled out a debriefing form which asked them to "describe how [they] approached the problem of finding the best starting boxes in the 12 games." Subjects were also asked for background information (e.g., major and math/statistics/economics background). Subjects were then asked to make 14 expected value calculations in simple trees (three 2×2 trees, three 2×3 trees, and one 2×4 tree). Subjects were told to estimate the expected value of each starting row in these trees.

¹⁰One caveat applies. The payoffs in the boxes in the first three columns were drawn from the uniform distribution with support [−5, 5].

Figure 3 provides an example of one such tree. Subjects were told that they would be given a fixed reward (≈ $1.25) for every expected value that they calculated correctly (i.e., within 10% of the exact answer). We gave subjects these expected value questions because we wanted to identify any subjects who did not understand the concept of expected value. We exclude any subject from our econometric analysis who did not answer correctly at least seven of the fourteen expected value questions. Out of the 251 subjects who provided an answer for all twelve decision trees, 230, or 92%, answered correctly at least seven of the fourteen expected value questions. Out of this subpopulation, the median score was 12 out of 14 correct; 190 out of 230 answered at least one of the two expected values in Figure 3 correctly.

7 Statistical analysis

We restrict analysis to the 230 subjects who met our inclusion criteria, but the results are almost identical when we include all 251 subjects who completed the decision trees. Had subjects chosen randomly, they would have chosen starting boxes with average payoff $1.30.¹¹ Had the subjects chosen the starting boxes with the highest payoff, the average chosen payoff would have been $9.74. In fact, subjects chose starting boxes with average payoff $6.72.

We are interested in comparing five different classes of models: rationality, directed cognition, column cutoff, discounting, and FTL. All of these models have a nuisance parameter, β, which captures the tendency to pick starting boxes with the highest evaluations (see equation 22). For each model, we estimate a different β parameter. Directed cognition, column cutoff, discounting, and FTL also require an additional parameter which we estimate in our econometric analysis.

7.1 Formal model testing

There are G = 12 games, B = 10 starting boxes per game, and I = 230 subjects who satisfy our inclusion criterion (they chose a row in all twelve decision trees and answered correctly at least half of the expected value questions on the post-experiment debriefing form).

¹¹In theory this would have been zero since our games are randomly drawn from a distribution with zero mean. Our realized value is well within the two-standard-error bands.

Call $X^i \in \mathbb{R}^{BG}$ the choice vector such that $X^i_{bg} = 1$ if subject i played box b in game g, and 0 otherwise. Call $P = E[X^i]$ the true distribution of choices, $\hat{P} := \frac{1}{I}\sum_{i=1}^{I} X^i$ the empirical distribution of choices, $\Sigma = E[X^i X^{i\prime}] - PP'$ the $(BG \times BG)$ variance-covariance matrix of the choices $X^i$, and its estimator $\hat{\Sigma} := \left(\sum_{i=1}^{I} X^i X^{i\prime}/I - \hat{P}\hat{P}'\right)/(1 - 1/I)$.

Say that a model m (rationality, directed cognition, FTL, etc.) gives the value $\mathcal{E}^m_{bg}$ to box b in game g. Recall that we parameterize the link between this value and the probability that a player will choose b in game g by

$$P^m_{bg}(\beta) = \left(P(\mathcal{E}, \beta)\right)_{bg} = \frac{\exp\left(\beta\, \mathcal{E}^m_{bg}\right)}{\sum_{b'=1}^{B} \exp\left(\beta\, \mathcal{E}^m_{b'g}\right)}$$

for a given β. To determine β, we minimize the distance (in $\mathbb{R}^{BG}$) between the empirical distribution $\hat{P}$ and the distribution predicted by the model, given β. Specifically, we take

$$\|P\| = \left(\sum_g \sum_b P_{bg}^2\right)^{1/2} \quad \text{for } P \in \mathbb{R}^{BG},$$
$$L^m\!\left(\hat{P}, \beta\right) = \left\|P^m(\beta) - \hat{P}\right\|,$$
$$\hat{L}^m = L^m\!\left(\hat{P}\right) = \min_{\beta \in \mathbb{R}^+} L^m\!\left(\hat{P}, \beta\right),$$

and set $\hat{\beta}^m := \arg\min_{\beta \in \mathbb{R}^+} L^m\!\left(\hat{P}, \beta\right)$. We call $\hat{L}^m = L^m\!\left(\hat{P}\right)$ the empirical distance between the model and the data. Call $L^m$ and $\beta^m$ the analogues for the true distribution of choices, i.e., $L^m = \min_\beta \left\|P^m(\beta) - P\right\|$.

We now derive statistics¹² on $\hat{L}^m$. $\varepsilon := \hat{P} - P$ has a distribution such that $\sqrt{I}\,\varepsilon \sim N(0, \Sigma)$ for large I. Because ε is small, $L^m(\hat{P}) = L^m(P) + \partial_P L^m(P) \cdot \varepsilon + O(1/I)$, and as, by the envelope theorem, $\partial_P L^m(P) = \Delta^m$, where

$$\Delta^m_{bg} = \frac{P_{bg} - P^m_{bg}(\beta^m)}{\left(\sum_{b'g'} \left(P_{b'g'} - P^m_{b'g'}(\beta^m)\right)^2\right)^{1/2}}, \quad (23)$$

we get:

¹²We use the fact, exploited by Vuong (1989) and Hansen-Heaton-Luttmer (1995), that we do not have to explicitly derive the distribution of $\hat{\beta}^m$ to get the distribution of $\hat{L}^m$.

$$\hat{L}^m = L^m + \Delta^m \cdot \varepsilon + O(1/I), \quad (24)$$

where · is the scalar product in $\mathbb{R}^{BG}$.

Suppose that $\Delta^m \ne 0$, i.e., the model m is misspecified (this will be the case for all the models considered¹³). Equation (24) implies that $\sqrt{I}\left(\hat{L}^m - L^m\right) \sim N(0, \Delta^{m\prime}\Sigma\Delta^m)$, which gives us a consistent estimator of the standard deviation of $\hat{L}^m$:

$$\hat{\sigma}_{L^m} = \left(\hat{\Delta}^{m\prime}\,\hat{\Sigma}\,\hat{\Delta}^m / I\right)^{1/2}. \quad (25)$$

We will be interested in the difference $L^m - L^n$ of the performance of two theories m and n. Equation (24) implies that

$$\hat{L}^m - \hat{L}^n = L^m - L^n + (\Delta^m - \Delta^n) \cdot \varepsilon + O(1/I),$$

so that we get a consistent estimator of the standard deviation of $\hat{L}^m - \hat{L}^n$:

1/2 b b m n m n σLm Ln = (∆ ∆ )0Σ(∆ ∆ )/I . (26) − − − Those formulae carry³ over to models that have been´ optimized. Our b d c b d c models have a free parameter: in directed cognition, the parameter is q, the cost (per box) of cognition; in column-discounting, it is δ,thediscount rate;incolumncutoff,itisQ,thefinal column analyzed; in FTL it is p,the cutoff probability. Say that in general, when we consider a class of models M, it can have a real (or vector) parameter x: M = M (x) ,x XM . For instance, if M = FTL,andx = .25,thenM(x) represents{ FTL∈ with} probability cutoff p = .25, and the space of parameter x is XFTL =[0, 1]. The space can be discrete: for M =Column Cutoff,XM = 1, 2, ..., 10 . For each x, M(x) gives a vector of values M (x) fortheboxes,andhencea{ } distance to the empirical distribution, forE a given β : P M (x), β P . For model M, we can optimize over parameter x to||findE the best possible− || ¡ ¢ fit by calculating: b LM := min min P M (x), β P (27) x XM β R+ || E − || ∈ ∈ M ¡ ¢ Also call xM andb β the argmin values in (25).b As this is again an application of the delta method, the standard errors calculated in (25) and (26) are still valid,b evenb though we optimize on parameter x. 13 m m AtestthatL =0can be readily performed using the estimates L and σLm given in the tables. 6 b b 28 7.2 Empirical results For the directed cognition model we estimate q = .012, implying that the shadowpriceofextendingapathone-columndeeperintothetreeisalittle above 1 cent. Given the amount of time that subjects had, this q value implies a time cost of 2.25 seconds per one-column extension.14 We view both of these numbers as reasonable. We had chosen 1 cent per box and 1 second per box as our preferred values before we estimated these values. For the column-cutoff model, we estimate Q =8, implying that the decision-maker evaluates only the first 8 columns of each tree (and hence all of the columns in 10 5 trees). For the column-discounting model we estimate δ = .91, implying× that the probability that the decision-maker sees a payoff in column c is (.91)c. Finally, for FTL we estimate p = .30, implying that the decision-maker follows branches until she encounters a branch probability which is less than .30. We had exogenously imposed a p value of .25 in our previous work (Gabaix and Laibson, 2000). Table 2 reports test statistics for each of our models. The rows of the table represent the different models, m rational, directed cognition, column-cutoff, column-discounting, FTL . Column∈ { 1 reports distance met- ric Lm, whichmeasuresthedistancebetweentheestimatedmodelandthe} empirical data. Columns 2-5 report the differences Lm Lm0 , and the asso- − ciatedb asymptotic standard errors. When Lm Lm0 > 0, model m performs − relatively poorly, since the distance of model m fromb theb empirical data m (L ) is greater than the distance of the modelb mb 0 from the empirical data (Lm0 ). b b As column 2 shows, all of the decision cost models beat rationality. The differences between the distance measures are: Lrational Ldirected cognition = 0.190, Lrational Ldirected cognition =0.399, Lrational L−column cutoff =0.221, − − Lrational Lcolumn discounting =0.190,andLrationalb LFTLb=0.488.Allof − − these dibfferences haveb associated t-statisticsb of at leastb 5. b Amongb the decision cost models theb two winnersb are directed cogni- tion and FTL. Both of these models dominate column cutoff and column discounting. The difference between directed cognition and FTL is not sta- tistically significant. 
The general success of the directed cognition model is somewhat surprising, since column-cutoff, column-discounting, and FTL were designed explicitly for this game against nature, whereas directed cognition is a far more generally applicable algorithm.

14. Explain calculation.
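To fix ideas, the two-level minimization in (27) can be sketched in a few lines of code. The sketch below is ours and is only illustrative: model_values (returning the vector E^{M(x)} of box values), P (mapping box values and β to a predicted choice distribution), and P_hat (the empirical distribution) are hypothetical stand-ins for the paper's objects, not the actual estimation code.

```python
# A minimal sketch of the two-level estimation in equation (27): an outer
# search over the model parameter x and an inner minimization over beta.
# `model_values`, `P`, and `P_hat` are hypothetical stand-ins.
import numpy as np
from scipy.optimize import minimize_scalar

def fit_model(model_values, P, P_hat, X_M):
    """Return (L_hat, x_hat, beta_hat), minimizing ||P(E^{M(x)}, beta) - P_hat||."""
    best = (np.inf, None, None)
    for x in X_M:  # outer minimization over the model parameter x
        values = model_values(x)
        # inner minimization over beta >= 0 (bounded line search)
        res = minimize_scalar(lambda b: np.linalg.norm(P(values, b) - P_hat),
                              bounds=(1e-6, 100.0), method='bounded')
        if res.fun < best[0]:
            best = (res.fun, x, res.x)
    return best

# Example parameter spaces: X_M = range(1, 11) for column-cutoff, or a grid
# such as np.linspace(0, 1, 101) for FTL's cutoff probability p.
```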

8 Applications of the directed cognition model

8.1 N-player sequential games of perfect information

The model has a general structure – mental operators, selected on the basis of their expected net payoffs, evaluated via a proxy value function – that can be used to predict behavior in a wide range of settings. Here we sketch such an application to N-player sequential games of perfect information. In this game theory application, players choose among two types of cognitive operators: "forward sweeps" and "backward sweeps." Forward sweeps simulate game paths from left to right, following the arrow of time, and following branches that are predicted to convey the most useful information about the characteristics of the game. Backward sweeps execute constrained backward induction operations, where the backward induction is started at some selected node in the middle of the game, and may follow a subset of paths backward from that node. The quantity of anticipated information, σ_f, associated with these operations is approximated and used with the w(x, σ) function to derive the expected gains of those cognitive operations. The application uses the same three-step procedure as above. Step 1 evaluates the expected gains from the different forward and backward sweeps. Step 2 executes the sweep with the highest expected gain. Step 3 iterates back to the beginning of the cycle, until the expected gain of the best remaining sweep no longer covers its cognition cost.
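The cycle can be rendered schematically as follows. This is our sketch under assumed names: candidate_sweeps, sigma_f, and execute are hypothetical helpers standing in for the enumeration of feasible sweeps, the approximation of anticipated information, and the updating of the cognitive state; only the proxy value function w(x, σ) and the per-operation cost q correspond to objects defined in the paper.

```python
# A sketch (ours) of the three-step directed cognition cycle for N-player
# sequential games. `candidate_sweeps`, `sigma_f`, and `execute` are
# hypothetical helper functions; `w` is the proxy value function w(x, sigma)
# and `q` is the cost per cognitive operation.
def directed_cognition(game, state, q, w, candidate_sweeps, sigma_f, execute):
    while True:
        # Step 1: expected gain of each feasible forward or backward sweep.
        gains = {sweep: w(state.x, sigma_f(game, state, sweep))
                 for sweep in candidate_sweeps(game, state)}
        if not gains:
            break
        best = max(gains, key=gains.get)
        # Step 3 (stopping rule): quit when the best gain no longer covers q.
        if gains[best] <= q:
            break
        # Step 2: execute the sweep with the highest expected gain.
        state = execute(game, state, best)
    return state
```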

8.2 Salience (attention)

The directed cognition model implies that the decision-maker thinks about only a small subset of the aspects of the decision-tree. In this sense, the directed cognition model provides a microfoundation for the psychological concepts of salience and attention. The model predicts that the decision-maker will pay attention to only a small amount of relevant information. By and large, most information in the decision-tree (and in life in general) will be ignored.

First, the decision-maker will not consider information that is not relevant to her decision. Decision-makers won't extend/see paths that start in low-payoff rows. For example, the algorithm does not extend/see any paths in Figure 2 that start in row seven. This row does not catch the eye of the decision-maker, or the attention of the algorithm, since the starting payoffs are not sufficiently promising.

In addition, the decision-maker will not consider information that is not likely to make a big difference to her decision. For example, the decision-maker won't extend/see paths that start with low-probability branches. In Figure 2, the algorithm extends row 10, and this extension follows the branch with probability .65. The other branch (with probability .35) is never extended/seen past column two.

These salience properties will generalize to other settings. We attend to information sources that are likely to hold large amounts of information that matter for the decisions we will take. Thus, at a cocktail party we immediately tune into a conversation across the room if we hear our own name spoken. We sensibly focus on information that matters to us and screen out the rest.

8.3 Myopia

The directed cognition model generates decisions that necessarily look myopic from the perspective of a perfectly rational outsider. The algorithm executes operations that reveal relatively large amounts of information about expected payoffs. In almost all settings, the near-term future is more predictable than the distant future, so cost-effective information revelation will tend to overlook long-term events. Long-term events tend to be low-probability events and hence are not "worth" evaluating. For example, our algorithm endogenously decides to ignore columns nine and ten in Figure 2. The algorithm estimates that the expected value of information in these columns is more than offset by the cognitive costs of processing that information. Hence, the model predicts that decision-makers will act as if the "future" doesn't matter. The decision-makers are sophisticatedly ignoring the future, since the low level of predictability makes analysis cost-ineffective.

8.4 Comparative statics with incentives

The directed cognition model generates quantitative predictions about the role of incentives. For example, the model predicts what will happen as incentives are multiplicatively adjusted from a to λa. The equilibrium strategy maximizes

E\left[\lambda a_\tau^+ - q\tau\right] = \lambda \, E\left[a_\tau^+ - (q/\lambda)\,\tau\right],

so the strategies will adjust in the same way that they would have if the shadow cost q were replaced by q/λ. Naturally, higher incentives induce longer thinking and more accurate choices.^15 The model makes quantitative (testable) predictions about these relationships.

15. A simple measure of the performance of the choice is the expected consumption utility of the decision-maker, E[a_T 1_{a_T > 0}].
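To make the comparative static concrete, we can combine this rescaling with the constant-case formulas derived in Appendix A; the worked example below is ours, though each ingredient appears in the appendix. Replacing q by q/λ in the optimal threshold \bar{a} = \alpha\sigma^2/q and in the expected thinking time at a = 0, T(0) = \alpha^2\sigma^2/q^2, gives

\bar{a}(\lambda) = \lambda \, \alpha \frac{\sigma^2}{q}, \qquad T(0; \lambda) = \lambda^2 \, \frac{\alpha^2 \sigma^2}{q^2},

so a λ-fold increase in stakes widens the stopping thresholds by the factor λ and multiplies expected thinking time by λ².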

8.5 Anchoring and adjustment

A large psychology literature has studied the phenomena of anchoring and adjustment.^16 When subjects think about a problem, they begin with an initial guess that is anchored to initial information. After further thought, their answers tend to adjust toward the correct answer. The more they are encouraged/allowed to think about the problem, the more adjustment occurs, and the closer the final answer is to the correct answer. The directed cognition model generates exactly this dynamic adjustment process. The model explains why initial guesses are distinct from final answers, and predicts quantitatively how additional thinking will affect the respondent's payoffs.

16. See Fiske and Taylor (1991, p. 389) for a review.

8.6 Contracts

Many economists believe that incomplete contracts arise because the costs of writing detailed contingent contracts offset the probabilistic benefits. Economists routinely assume that contract costs exist, but rarely formally model those costs. The directed cognition model provides a preliminary step toward quantitatively measuring the cognition costs that arise when contracts are written or read. For example, our decision-tree application generates implied variation in the cognition costs required to satisfactorily analyze each of our twelve decision trees. By providing a quantitative theory of cognitive costs, the model can predict which situations will lead parties to avoid contract/cognition costs. Hence extensions of the model can predict when parties will write incomplete contracts or rely on boiler-plate contracts.^17

17. Boiler-plate contracts do not need to be read because they are 'fair' or already well-understood.

9 Conclusion

We have described a model of boundedly rational choice. This model is modular, and there are many improvements that could be made to each of its parts: expanding the set of mental operators, improving the sophistication of the myopic proxy value function, adding trembles in the choice of the best cognitive operation, etc. In general, we have adopted the simplest possible version of each of the model's features. We wish to highlight two important shortcomings of the model: its practical implementation requires some computer programming, and it has so far only been applied to a restricted class of decision problems. On a positive note, we believe that the

model realizes three goals. First, the model is psychologically plausible – indeed, subjects claim to use strategies similar to those in the model. Second, the model generates precise quantitative predictions and can be empirically tested; the data reject the rational model in favor of our bounded rationality alternative. Third, the general structure of the model (choice of mental operators, selected on the basis of their expected payoffs, evaluated via a proxy value function) can be applied to a wide class of problems. In another paper, we are using this framework to analyze N-player games. We hope that applications to more concrete economic settings, such as contracts, savings behavior, and portfolio choice, will follow.

10 Appendix A: Derivation of the value function for the divine and directed cognition models

The value function, V (a, t), is characterized by a partial differential equation in the continuation region,

0 = -q + V_t + \frac{1}{2} \sigma^2(t) V_{aa}.

Because of the symmetry of the problem, the decision-maker will stop when a_t crosses either of the time-contingent thresholds -\bar{a}_t and \bar{a}_t. At those thresholds standard value matching conditions apply: V(-\bar{a}_t, t) = 0, V(\bar{a}_t, t) = \bar{a}_t. Smooth pasting conditions also apply when the thresholds are chosen optimally: V_a(-\bar{a}_t, t) = 0, V_a(\bar{a}_t, t) = 1.

Let T(a, t) represent the expectation of the remaining time spent on the problem,

T(a, t) = E[\tau \mid a_t = a] - t.

So T(a, t) is the solution of the differential equation

0 = 1 + T_t + \frac{1}{2} \sigma^2(t) T_{aa}   (28)

with value matching conditions T(\bar{a}_t, t) = 0 and T(-\bar{a}_t, t) = 0.

Let C(a, t) represent the expected value of consumption, not including cognition costs. Then

C(a, t) = E[a_T 1_{a_T > 0}] = E[a_T^+].

The differential equation that characterizes C is

0 = C_t + \frac{1}{2} \sigma^2(t) C_{aa}   (29)

with value matching conditions C(\bar{a}_t, t) = \bar{a}_t and C(-\bar{a}_t, t) = 0.

We consider two special cases which admit analytic solutions: \sigma^2(t) = \sigma^2 and \sigma^2(t) = \max\{\sigma^2(0) - 2\beta t, 0\}. We refer to these respectively as the constant and affine cases. In the constant case, the problem is stationary and t drops out of all expressions. The solution to the Bellman equation is

V(a, t) = \frac{q}{\sigma^2} a^2 + C_1 a + C_2   (30)

with constants C_1 and C_2 determined by the value matching conditions. Smooth pasting conditions are also applied when we solve for the optimal V. For example, in the optimal case,

V(a, t) = \frac{q}{\sigma^2} a^2 + \frac{a}{2} + \alpha \left( \frac{1}{2} - \alpha \right) \frac{\sigma^2}{q},

where \bar{a} is the optimal stopping threshold, \bar{a} = \alpha \frac{\sigma^2}{q}, for \alpha = \frac{1}{4}.
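As a consistency check, the Bellman equation, value matching, and smooth pasting conditions for this solution can be verified symbolically. The following sketch is ours (it uses sympy and our own variable names) and is not part of the original derivation:

```python
# Symbolic check (ours) of the constant-case solution: the Bellman equation,
# value matching, and smooth pasting, with threshold abar = alpha*sigma^2/q.
import sympy as sp

a, q, sigma, alpha = sp.symbols('a q sigma alpha', positive=True)

V = (q/sigma**2)*a**2 + a/2 + alpha*(sp.Rational(1, 2) - alpha)*sigma**2/q
abar = alpha*sigma**2/q

# Bellman equation 0 = -q + (1/2) sigma^2 V_aa (V_t = 0 in the stationary case):
print(sp.simplify(-q + sp.Rational(1, 2)*sigma**2*sp.diff(V, a, 2)))  # -> 0

# Value matching: V(-abar) = 0 and V(abar) = abar:
print(sp.simplify(V.subs(a, -abar)))         # -> 0
print(sp.simplify(V.subs(a, abar) - abar))   # -> 0

# Smooth pasting V_a(-abar) = 0 and V_a(abar) = 1 pins down alpha = 1/4:
Va = sp.diff(V, a)
print(sp.solve(sp.Eq(Va.subs(a, -abar), 0), alpha))  # -> [1/4]
print(sp.solve(sp.Eq(Va.subs(a, abar), 1), alpha))   # -> [1/4]
```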

Using equation (28) we can show that the expected thinking time remaining is

T(a, t) = -a^2/\sigma^2 + \alpha^2 \sigma^2 / q^2.   (31)

Using equation (29) we can show that the expected consumption utility is

C(a, t) = \frac{1}{2q} \left( \alpha \sigma^2 + q a \right).   (32)

Equations (31) and (32) apply to both the rational case and the directed cognition cases of our model. When the variance declines linearly (the affine case), it is convenient to let t = 0 represent the moment at which variance falls to zero. Then we have

\sigma^2(t) = -2 s^2 t   for t \in [t_0, 0], \; t_0 < 0,
\sigma^2(t) = 0   for t > 0.

Now the solution of the Bellman equation is a relatively simple expression,

V(a, t) = C_1 v(a, -st) + C_2 a + q t,   (33)

where

v(a, \sigma) = a \, \Phi\!\left(\frac{a}{\sigma}\right) + \sigma \, \phi\!\left(\frac{a}{\sigma}\right)   (34)

with \Phi(\cdot) the standard normal distribution function and \phi(\cdot) the standard normal density. Note that

v(a, \sigma) \equiv E[(a + \sigma u)^+]

for a standard normal u. The solution proposed in equation (33) can be verified by noting

v_a(a, \sigma) = \Phi\!\left(\frac{a}{\sigma}\right), \qquad v_\sigma(a, \sigma) = \phi\!\left(\frac{a}{\sigma}\right).
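Equation (34) and the representation v(a, σ) = E[(a + σu)^+] can also be spot-checked numerically. The following sketch is ours, not part of the paper:

```python
# Numeric spot-check (ours) that v(a, sigma) = E[(a + sigma*u)^+], equation (34).
import numpy as np
from scipy.stats import norm

def v(a, sigma):
    # v(a, sigma) = a*Phi(a/sigma) + sigma*phi(a/sigma)
    return a*norm.cdf(a/sigma) + sigma*norm.pdf(a/sigma)

rng = np.random.default_rng(0)
u = rng.standard_normal(2_000_000)
for a, sigma in [(0.5, 1.0), (-1.0, 2.0), (2.0, 0.5)]:
    mc = np.maximum(a + sigma*u, 0).mean()  # Monte Carlo estimate of E[(a+sigma*u)^+]
    print(a, sigma, mc, v(a, sigma))        # the last two columns should agree to ~3 decimals
```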

Solving for the constants C_1 and C_2 (using the value matching conditions) completes the value function in the affine case.

11 Appendix B: Proofs

1. [...] Taking the \sup_t of both sides we get

S^1(x, \sigma) \geq x/2 + \sup_t \left[ \frac{1}{\sqrt{2\pi}} \, \sigma(0,t) - q t \right] = x/2 + S^0\!\left(0, \sqrt{2/\pi}\,\sigma\right).

Applying \max(\cdot, x^+) to both sides, we get S^1(x, \sigma) \geq S^0(x, \sqrt{2/\pi}\,\sigma).

2. S^1(x, \sigma) \leq V^1(x, \sigma): call S^1(x_t, t, k) = E[x_{t+k}^+ - q k], so that S^1(x, t) = \sup_{k \in R^+} S^1(x, t, k). For 0 \leq s \leq k,

S^1(x_t, t, k) = E[S^1(x_{t+s}, t+s, k-s)] - q s \leq E[S^1(x_{t+s}, t+s)] - q s,

and taking the \sup_{k \in R^+} on the left-hand side, we get S^1(x_t, t) \leq E[S^1(x_{t+s}, t+s)] - q s for s small enough. As V^1(x_t, t) = E[V^1(x_{t+s}, t+s)] - q s, we get, for D^1(x_t, t) = V^1(x_t, t) - S^1(x_t, t), that D^1(x_t, t) \geq E[D^1(x_{t+s}, t+s)], i.e. that D^1 is a supermartingale. The search process stops at a random time T, and then S^1(x_T, T) = V^1(x_T, T) = x_T^+, so D^1(x_T, T) = 0. As D^1(x_t, t) \geq E[D^1(x_T, T)] = 0, we can conclude V^1(x_t, t) \geq S^1(x_t, t).

3. V^1(x, \sigma(\cdot)) \leq V(x, \sigma(\cdot)): this is immediate, because the inequality holds for any policy w^i, and V is the value of the best possible policy.

4. First, observe (with, again, x0 = x)

E[\max(x_\tau, 0)] = E[x_\tau/2 + |x_\tau|/2]   by the identity \max(x, 0) = (x + |x|)/2
= x/2 + E[|x_\tau|]/2   as x_t is a martingale which stops a.s.
\leq x/2 + E[x_\tau^2]^{1/2}/2
= x/2 + \left(x^2 + E[\sigma(0,\tau)^2]\right)^{1/2}/2   as x_t^2 - \sigma(0,t)^2 is a martingale   (39)
\leq x/2 + \left(x^2 + \sigma(0,m)^2\right)^{1/2}/2   with m = E[\tau \mid x_0 = x], as \sigma(0,t)^2 is concave,

so we get V(x) \leq x/2 + (x^2 + \sigma(0,m)^2)^{1/2}/2 - q m. Hence, with

W^2(x, \sigma) = x/2 + \sup_{t \in R^+} \left(x^2 + \sigma(0,t)^2\right)^{1/2}/2 - q t,

we get V(x) \leq W^2(x). We just have to prove W^2 = S^2, i.e.

x/2 + (x^2 + s^2)^{1/2}/2 = w^2(x, s),

which can be done by direct calculation of w^2.

5. We have, because \sqrt{a + b} \leq \sqrt{a} + \sqrt{b}:

x/2 + \sqrt{x^2 + \sigma^2(0,t)}/2 - q t \leq x/2 + \sqrt{x^2}/2 + \sigma(0,t)/2 - q t = x^+ + \sigma(0,t)/2 - q t,

so by taking the \sup_t of both sides we get

S^2(x, \sigma) \leq x^+ + S^0(0, \sigma),

whose right-hand side is less than or equal to S^0(x, \sigma) + \min(|x|, S^0(0, \sigma))/2, by direct verification using the definition of w^0. \square

12 References

Bazerman, Max and William F. Samuelson “I won the auction but don’t want the prize,” Journal of Conflict Resolution, December 1983, 27, 618-34.

Camerer, Colin, E. J. Johnson, T. Rymon, and S. Sen, “Cognition and Framing in Sequential Bargaining for Gains and Losses,” in Ken Bin- more, Alan Kirman, and P. Tani, eds., Frontiers of game theory. Cam- bridge, MA: MIT Press, 1994, pp. 27-47.

Camerer, Colin and Ho, Teck-Hua. "Experience weighted attraction learning in normal-form games." Econometrica, 1998.

Conlisk, John. “Why bounded rationality?” J. Econ. Literature,June 1996, 34 (2), pp.669-700.

Erev, Ido and Roth, Alvin. “Predicting how people play games”. Amer. Econ. Rev., Sept. 1998, 88 (2), pp.848-81.

Fiske, Susan and Shelley Taylor. “Social cognition”. New York, NY: McGraw-Hill, 1991.

Gabaix, Xavier and Laibson, David. "A boundedly rational decision algorithm." AER Papers and Proceedings, May 2000.

Gittins, J.C. (1979) “Bandit processes and dynamic allocation indices”, Journal of the Royal Statistical Society B, 41, 2, p.148-77.

Hansen, Lars; Heaton, John; and Luttmer, Erzo. "Econometric evaluation of asset pricing models." Rev. Fin. Stud., Summer 1995, 8 (1), pp. 237-274.

Jehiel, Philippe. “Limited horizon forecast in repeated alternate games,” Journal of Economic Theory, 1995, 67: 497-519.

Kahneman, Daniel and Tversky, Amos. “Judgment Under Uncertainty: Heuristics and Biases”. Science, 1974.

Macleod, W. Bentley. (1999) “Complexity, Bounded Rationality and Heuris- tic Search,” mimeo, University of Southern California.

Moscarini, Giuseppe, and Lones Smith. “The Optimal Level of Experi- mentation,” Yale and University of Michigan mimeo.

Mullainathan, Sendhil, “A memory-based model of bounded rationality”, MIT mimeo, 1998.

Payne, John W., James R. Bettman, and Eric J. Johnson, The Adaptive Decision Maker, 1993, Cambridge: Cambridge University Press.

Roberts, Kevin and Martin Weitzman (1980). "On a general approach to search and information gathering," unpublished.

Rubinstein, Ariel. (1998) Modeling Bounded Rationality. MIT Press: Cam- bridge.

Simon, Herbert. “A behavioral model of rational choice,” Quart. J. Econ., Feb 1955, 69 (1).

Thaler, Richard. Quasi Rational Economics, New York: Russell Sage, 1991.

Thaler, Richard. The winner’s curse. New York: Russell Sage, 1994.

Vuong, Q. H. "Likelihood ratio tests for model selection and non-nested hypotheses," Econometrica, 1989, 57 (XXX), pp. 307-33.

Weitzman, Martin (1979) “Optimal search for the best alternative”, Econo- metrica 47, 3, p.641-54.

[Figure 1: branch probabilities and payoffs for game-tree rows a through j; the two-dimensional tree layout is not recoverable in this text version.]

Table 1: Predictions of the algorithms for the game in Figure 1

                         Values                                        Probabilities (%)
Box   Rational  Dir.Cog.  Col.Cutoff  Col.Disc.   FTL    Empirical  Rational  Dir.Cog.  Col.Cutoff  Col.Disc.   FTL
 1      -0.57     0.33       0.39       -0.07     0.77       6.1       3.1       3.6       2.8        3.2       2.7
 2       5.78     5.75       6.76        5.37     5.88       4.3      17.6      17.9      17.8       17.0      18.2
 3      -1.31     0.91      -0.27       -0.20    -1.90       2.6       2.5       4.3       2.3        3.0       1.0
 4       4.11     0.00       4.96        3.65     4.35       3.0      11.1       3.3      10.6       10.0      10.3
 5       6.19     3.60       6.99        5.83     6.22      11.3      19.8       9.5      19.0       19.5      20.6
 6       6.88     5.36       7.68        6.56     6.21      23.9      23.9      16.0      23.1       24.5      20.6
 7      -7.29    -1.00      -6.27       -5.40    -8.29       1.3       0.5       2.5       0.4        0.6       0.1
 8      -1.89    -4.48      -0.88       -2.25    -1.36       1.7       2.1       0.9       2.0        1.6       1.2
 9       3.00     5.06       4.59        2.70     4.64       9.1       8.2      14.6       9.5        7.4      11.5
10       4.12     7.19       5.54        4.56     5.13      36.5      11.2      27.4      12.5       13.2      13.8

Table 2

Model                          L(m)      L(m)-L(Rational)  L(m)-L(DC)   L(m)-L(CC)   L(m)-L(CD)
Rational                       2.969
                              (0.120)
Directed cognition (q=0.012)   2.570         -0.399
                              (0.107)        (0.080)
Column Cutoff (Q=8)            2.748         -0.221          0.177
                              (0.119)        (0.026)        (0.071)
Column Discounting (δ=.91)     2.780         -0.190          0.209        0.032
                              (0.120)        (0.037)        (0.070)      (0.017)
FTL (p=0.30)                   2.481         -0.488         -0.089       -0.267       -0.298
                              (0.095)        (0.046)        (0.057)      (0.033)     (0.034)

L(m) is the distance between the predictions of model m and the empirical data. Parameters q, Q, δ, and p have been set to maximize the fit of the models to which they correspond. Standard errors are in parentheses.
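As a quick arithmetic check of the Section 7.2 claim that every decision cost model beats rationality with a t-statistic of at least 5, one can divide the column-2 differences in Table 2 by their standard errors; the short computation below is ours:

```python
# Check (ours) of the "t-statistics of at least 5" claim in Section 7.2,
# using the differences and standard errors from column 2 of Table 2.
diffs = {'Dir. Cog.': (-0.399, 0.080), 'Column Cutoff': (-0.221, 0.026),
         'Column Discounting': (-0.190, 0.037), 'FTL': (-0.488, 0.046)}
for model, (d, se) in diffs.items():
    print(f"{model}: t = {abs(d)/se:.1f}")   # 5.0, 8.5, 5.1, 10.6
```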