<<

University of Pennsylvania ScholarlyCommons

Marketing Papers Wharton Faculty Research

9-2006

Extending the Bounds of Rationality: Evidence and Theories of Preferential Choice

Jörg Rieskamp

Jerome R. Busemeyer

Barbara Mellers University of Pennsylvania

Follow this and additional works at: https://repository.upenn.edu/marketing_papers

Part of the Behavioral Commons, and Commons, Cognitive Commons, and the Marketing Commons

Recommended Citation Rieskamp, J., Busemeyer, J. R., & Mellers, B. (2006). Extending the Bounds of Rationality: Evidence and Theories of Preferential Choice. Journal of Economic Literature, 44 (3), 631-661. http://dx.doi.org/ 10.1257/jel.44.3.631

This paper is posted at ScholarlyCommons. https://repository.upenn.edu/marketing_papers/414 For more information, please contact [email protected]. Extending the Bounds of Rationality: Evidence and Theories of Preferential Choice

Abstract Most define ationalityr in terms of consistency principles. These principles place "bounds" on rationality—bounds that range from perfect consistency to weak stochastic transitivity. Several decades of research on preferential choice has demonstrated how and when people violate these bounds. Many of these violations are interconnected and reflect systematic behavioral principles. We discuss the robustness of thes violations and review the theories that are able to predict them. We further discuss the adaptive functions of the violations. From this perspective, choices do more than reveal preferences; they also reflect subtle, yet often quite reasonable, dependencies on the environment.

Disciplines | Business | Cognition and Perception | | Marketing

This journal article is available at ScholarlyCommons: https://repository.upenn.edu/marketing_papers/414 Journal of Economic Literature Vol. XLIV (September 2006), pp. 631-661

Extending the Bounds of Rationality: Evidence and Theories of Preferential Choice

JORG RIESKAMP, JEROME R. BUSEMEYER, AND BARBARA A. MELLERS*

Most economists define rationality in terms of consistency principles. These principles place "bounds" on rationality-bounds that range from perfect consistency to weak stochastic transitivity. Several decades of research on preferential choice has demon- strated how and when people violate these bounds. Many of these violations are inter- connected and reflect systematic behavioral principles. We discuss the robustness of the violations and review the theories that are able to predict them. We further discuss the adaptive functions of the violations. From this perspective, choices do more than reveal preferences; they also reflect subtle, yet often quite reasonable, dependencies on the environment.

1. Introduction for chocolate over vanilla ice cream is "rational" without knowing the atti- Preferences are inherently subjective and tudes, values, , beliefs, and goals arise from a mixture of aspirations, of that individual. For this reason, most the- thoughts, motives, , beliefs, and orists evaluate the rationality of behavior desires. This inherent subjectivity means using principles of consistency and coher- that preferences are not easily evaluated ence within a system of preferences and against objective criteria without knowledge beliefs. of an individual's goals. For example, it is The view that rationality only refers to impossible to know whether an individual's internal coherence and logical consistency has been sharply criticized on multiple * Rieskamp: Max Planck Institute for Human grounds. Many have argued (e.g., Gerd Development, Berlin, Germany. Busemeyer: Indiana Gigerenzer 1996a) that consistency princi- University. Mellers: University of , Berkeley. We gratefully acknowledge helpful comments on previous ver- ples are insufficient for defining rationality. sions of this article by Claudia Gonzalez-Vallejo, Mark J. If the achievement of an individual's goal Machina, Anthony A. Marley, and Peter P. Wakker. We also does not imply consistency, it is questionable thank Donna Alexander for editing a draft of this article. whether functional behavior that violates The second author was partially supported by the China Europe International Business School and by NSF SES- consistency principles should be called "irra- 0083511 awarded to Indiana University, Bloomington, and tional." In addition, people are imperfect the third author was partially supported by NSF SES- 0111944 awarded to the University of California, information processors and limited in both Berkeley. knowledge and computational capacity.

631 632 Journal of Economic Literature, Vol. XLIV (September 2006)

Figure 1. Five Consistency Principles and Their Relatedness

Herbert A. Simon (1956, 1983) and others principles that go beyond assumptions about (Gigerenzer and 2001) use the properties or attributes of the choice the term to describe objects. For example, the transitivity axiom is human behavior. Consistency violations applicable to a wide range of choice objects, might well be adaptive when time, knowl- including both risky gambles as well as edge, or resources are scarce (e.g., Robert J. free commodity bundles. Aumann 1962; Mark J. Machina 1982; We organize our paper around five con- Michele Cohen and Jean-Yves Jaffray 1988; sistency principles that are summarized in Gigerenzer 1991, 1996b; Lola L. Lopes and figure 1. Each principle helps to define a Gregg C. Oden 1991; Kenneth R. class of theories. The principles differ in Hammond 1996; Lopes 1996; Philippe type (logically independent principles-one Mongin 2000). principle does not imply anything about Several decades of research on preferential another), and strength (logically dependent choice has led to an impressive body of principles-a stronger principle implies a empirical knowledge on how and when peo- weaker one). The principles have been ple violate consistency principles of preferen- labeled from I to V, but they only allow a tial choice. The purpose of this article is to partial ordering of constraints. review these basic facts. Our review differs The first principle is perfect consistency from previous reviews (e.g., Paul J. H. or invariant preferences across occasions. Schoemaker 1982; Chris Starmer 2000) in Perfect consistency implies the other four terms of the consistency principles we exam- principles. The second principle, strong sto- ine. Previous reviews focused primarily on chastic transitivity, is logically equivalent to consistency across operations applied to the third principle, independence from irrel- properties of choice objects. For example, the evant alternatives ( and independence axiom of expected theo- Edward J. Russo 1969). Regularity is the ry requires consistency across linear transfor- fourth principle, but there is no logical con- mations of probabilities and thus is restricted nection between independence from irrele- to the domain of decisions under risk. In vant alternatives and regularity (one does contrast, we are interested in consistency not logically imply the other). Independence Rieskamp, Busemeyer, and Mellers: Extending the Bounds of Rationality 633

from irrelevant alternatives implies the fifth rationality must be pushed to accommodate principle of weak stochastic transitivity. basic and systematic regularities in human A primary goal of this paper is to assess the choice. degree to which preferential choices obey the principles. For each principle, we sum- 2. Perfect Consistency marize the empirical evidence. When are the violations serious and robust? When are they The standard utility theory of preference small or limited? By evaluating the types of begins by positing a choice set, X = {A, B, violations and the magnitude of those viola- C, ... }, with elements, A, B, and C, denoting tions, we can determine which principles the options. The choice might involve must be relaxed in descriptive theories of meals, movies, or mutual funds. A prefer- preferential choice. Furthermore, we show ence relation, >W is defined for all pairs of that many violations are related and suggest options, such that A >_pB means that option some underlying behavioral regularities that A is preferred to option B, or that the deci- should be addressed in a general descriptive sionmaker is indifferent. Finally, a set of theory. axioms is imposed on the preference rela- A secondary goal is to discuss theories of tions to permit the construction of a utility preferential choice in terms of the consisten- function u: X --+ R. The utility function cy principles. We will review classes of theo- assigns a real-valued utility to each option ries and discuss their compliance or lack of (e.g., u(A) is the utility assigned to option compliance with the five consistency princi- A), and the order implied by the is ples. When evaluating the different theories, used to predict the empirical preference we will consider the extent to which they can ordering, that is, u(A) > u(B) if, and only if describe empirical violations of the consisten- A p B. cy principles. We will also consider the com- Standard utility theories include the plexity of the theories. Selten (1991) argued expected utility theory ( that, if two models are equally successful in and 1947), rank- predicting existing data, the parsimonious dependent utility theories (e.g., John model that, in principle, is only able to pre- Quiggin 1982; Chew Soo Hong, Edi Karni, dict a small range of data should be preferred and Zvi Safra 1987; Menahem E. Yaari 1987; over the more model that, a priori, Jerry R. Green and Bruno Jullien 1989; R. can predict a larger range of data. In a similar Duncan Luce 1990), and multiattribute util- vein, Mark Pitt, In Jae Myung, and Shaobo ity theories (Ralph L. Keeney and Howard Zhang (2002) have argued that a model's gen- Raiffa 1976; Detlof von Winterfeldt and eralizability, or the ability of the model to pre- Ward Edwards 1986). All such theories are dict independent new data, is essential to based on the axiom of transitivity. model selection. A complex model might be Transitivity states that for any triad of superior to a simpler model when fitting options, A, B, and C, if two preferences hold, observed behavior, but, due to the danger of A >p B and B 0, C, then a third one should overfitting, the complex model might be pre- follow, A p C. This axiom is a cornerstone of dicting noise rather than systematic regulari- normative and descriptive theories because ties and, therefore, is less suitable for it implies that the utilities can be represent- predicting new independent behavior. ed as transitive, real numbers (Paul A third goal of this paper is to step back and Samuelson 1953; Edwards 1954, 1961). reconsider the of rationality. We dis- To test this axiom, one must measure pref- cuss the advantages and disadvantages of each erences. There are at least two commonly principle from different perspectives and used methods to "elicit" preferences-one address the question of how far the bounds of based on choices and the other based on 634 Journal of Economic Literature, Vol. XLIV (September 2006)

Judged prices, typically referred to as cer- participants typically change their minds in tainty equivalents. Unfortunately, these approximately 25 percent of choice pairs methods do not always agree. For example, (approximately equal in expected but when presented with a choice between two differing in risk) that are presented twice gambles, A and B, individuals may prefer (Colin F. Camerer 1989; John D. Hey and gamble A on the basis of choice, but state a Chris Orme 1994; Peter Wakker, Ido Erev, higher price for gamble B (for reviews, see and Elke U. Weber 1994; T. Parker Ballinger David M. Grether and Charles R. Plott and Nathaniel T. Wilcox 1997; Jerome R. 1979; and Sarah Lichtenstein Busemeyer, Ethan Weg, Rachel Barkan, 1983; Tversky, Slovic, and Xuyang Li, and Zhengping Ma 2000). 1990; Barbara A. Mellers, Shi-Jie Chang, In an extensive study of choice consisten- Michael H. Birnbaum, and Lisa D. Ordonez cy, Hey (2001) asked participants to make 1992). The psychological reasons for differ- one hundred choices between pairs of gam- ent preferences obtained with different bles that were repeated in five sessions. response modes are controversial, and there Participants were paid according to the out- is no real consensus about which method is come of the chosen gamble. Overall rates of better. Both methods of eliciting preferences inconsistencies varied greatly across partici- have pros and cons. Nonetheless, many pants, ranging from 1 percent to 21 percent. researchers prefer to use choices, and for On average, participants changed their this reason, we focus on choice1 (for an alter- choices 11 percent of the time from the first native perspective, see Luce, Mellers, and to the second session. Not a single partici- Chang 1993). pant had consistent preferences across the Differences in measured preferences first two sessions, while rates of inconsisten- occur not only across methods but also cy decreased slightly. Participants reversed across occasions within an individual. In a their choices 9 percent of the time from the classic study, Frederick Mosteller and Philip fourth to the fifth session.2 Virtually all of the Nogee (1951) demonstrated that individuals participants were inconsistent throughout were often inconsistent in their preferences the entire experiment. Only one of the fifty- for simple gambles over repeated occasions. three participants made identical choices in Participants were asked to accept or reject the last three sessions. binary gambles with fixed- proba- In sum, we have crossed the first bound of bilities and increasing amounts to win. The rationality. The literature is replete with tendency for an individual to select the gam- choice variability over time, contexts, and ble did not suddenly jump from zero to one occasions. To accommodate these results at an acceptable winning payoff. Instead, and those of many other studies, we must acceptance frequencies were a smooth relaxS- the assumption that choices are deter- shaped function when plotted against the ministic, and accept the inherent variability winning amount. The preference function of preferences. was strikingly similar to a psychometric func- tion, such as those derived from the discrim- 3. Strong Stochastic Transitivity ination of stimuli along psychophysical We now turn to probabilistic theories of continuua, such as loudness, heaviness, and choice (for an overview, see Anthony A. J. brightness. More recent evidence shows that

2 Note, however, that in Hey (2001) participants were 1 It is reasonable to assume that when a decisionmaker forced to select one gamble out of the pairs instead of also states his or her price or certainty equivalent for an option, allowing expressions of indifference. Thereby, inconsistent he or she will often cognitively perform a series of com- choices do not necessarily represent systematic or unsys- parisons between the option and candidate prices, tematic inconsistencies, but could be due to indifferences although these comparisons are not indispensable. between particular gambles. Rieskamp, Busemeyer, and Mellers: Extending the Bounds of Rationality 635

Marley 1992). These theories posit the exis- Philipp E. Otto 2006). However, this tence of a function that maps choice pairs into approach has not addressed the phenomena probabilities. Hereafter, p(AI{A,B}) denotes reviewed in this article. the probability of choosing A from the set A Random utility theories, representing the and B. Choice probabilities are often estimat- first class of strong probabilistic choice theo- ed by presenting an individual with the same ries, are defined as follows: Given a set of n choice on multiple occasions (with other options, each option, Ai, is assigned a ran- problems interwoven between repetitions). dom utility U(Ai), defined by a single joint Weak probabilistic choice theories imply that distribution function for all choice sets.3 either A is preferred to B if, and only if, Option Ai is always chosen when it has the p(AI{A,B}) > .50, or B is preferred to A if, and highest utility of the available options, and only if, p(BI{A,B}) > .50 (indifference holds in its choice probability is: the very rare case that p(AI{A,B}) = .50; for a review, see Luce 2000). Strong probabilistic (1) p(A) = p2[U(A,) > U(Aj), choice theories make precise predictions for allj = 1, 2,)...,n,j # i. about the probability mass of each option in A simplifying assumption commonly used in the set. conjunction with standard random utility There are two classes of strong probabilis- models is that the utility of an option can be tic theories-random utility theories and expressed as fixed utility theories (see Gordon Becker, Morris H. Degroot, and Jacob Marschak (2) U(Ai) = Ikk ' Xik+ i, 1963; Luce and 1965). where x are the k attributes of the options, / According to random utility theories, indi- are the weights assigned to the attributes, viduals select the option with the highest and e is an error. Random utility models tend utility using a deterministic decision rule and to differ according to the assumptions made variable utilities that differ over time and about the errors. If it is assumed that the contexts (Louis L. Thurstone 1959; Becker, error component is identically and independ- Degroot, and Marschak 1963; Tom ently distributed following an extreme value Domencich and Daniel L. McFadden 1975; distribution, the multinomial logit model Charles F. Manski 1977; McFadden 1981, results, which is the most common random 2001; Marley and Hans Colonius 1992; for autility choice model (Kenneth E. Train 2003, review, see Marley 1992). In contrast, fixed for an overview see McFadden 2001). utility theories (also called "simple scalabili- According to this model, the probability that ty theories") assume that individuals select option Ai is chosen is defined as: the option with the highest utility using a probabilistic decision rule and deterministic utilities. If a theory fulfills the property of simple scalability, each option in a choice set (3) xfk) can be assigned a scale value such that the choice probabilities are a monotone function 3 More formally, assume that X is a complete set of options and Y is a subset of X(Y CX), that is Y= {A,,...,A,,A) of the scale values of the options (Tversky and X= {A...An} with n 0 m. Define Ux= [U],U2....U,,] 1972a). A third approach assumes that both as a random utility vector with the probability distribution the utilities and the decision process are function F(u,....u,) = F(u) and density f(u,...,Un) =f(u). Then p(AIY) equals the probability of sampling a random deterministic, but choices are based on ran- utility vector Ux within the set Ai,= {lu e R" I ui, - u for domly selected strategies (see, e.g., Machina j= 1 ...m}. This probability equals the integral of the dis- 1985; John W. Payne, James R. Bettman, and tribution within the region A,: Eric J. Johnson 1993; and Robert Sugden 1995; J6rg Rieskamp and p(AIY) = dF(u) = ff(u)du. 636 Journal of Economic Literature, Vol. XLIV (September 2006) where x and 0 are defined in equation 2. The u(A) 0 u(B) and the assumption that F is an key assumption of the model is the independ- increasing function of the first argument. The ent error component, which implies that the class of fixed utility theories includes Luce's alternatives' utilities are also independent. (1959) choice model and many more recent Fixed utility theories, representing the theories. Although not all random utility the- second class of strong probabilistic choice ories do, Thurstone's (1927) Case V model theories, are probabilistic extensions of and the multinominal logit model described deterministic utility theories that share a above also imply strong stochastic transitivity property known as "simple scalability" (for proofs, see Luce and Suppes 1965). For a (Becker, Degroot, and Marschak 1963; Luce review, see Hey and Orme (1994) and David and Suppes 1965; Tversky 1972a). Fixed util- W Harless and Camerer (1994). ity theories imply that a single scale value can Does human choice behavior obey the be assigned to each option in a complete set principle of strong stochastic transitivity? An of options and that this value will be inde- overwhelming number of studies suggest pendent of the other options in the choice set otherwise. Many of the original experiments presented. For example, u(A) and u(B) can were carried out several decades ago (Clyde be real numbers that represent the utilities of H. Coombs 1958; Coombs and Dean G. options A and B. The choice probability for Pruitt 1961; David H. Krantz 1967; Tversky an arbitrary pair of options is determined by and Russo 1969; Lennart Sjoberg 1975, means of a function, F, ranging from zero to 1977; Sjoberg and Dora Capozza 1975). In one, where: one study (Coombs and Pruitt 1961), partic- ipants were asked to choose between mone- (4) p(AI{A,B}) = F[u(A),u(B)]. tary gambles with different variances. In 25 F is strictly increasing in the first argument percent of all possible triplets, choices vio- and strictly decreasing in the second, and lated strong stochastic transitivity. In anoth- F[u(A), u(B)] = .50 when u(A) = u(B). All er study of choices between gambles with simplefixed utility theories imply strong sto- cigarettes as outcomes that were given to the chastic transitivity, which is a natural exten- participants, Betty J. Griswold and Luce sion of the deterministic axiom of transitivity (1962) found strong stochastic transitivity (see Edwards 1961). violations in 26 percent of all triplets. The consistency principle of strong sto- Donald L. Rumelhart and James G. Greeno chastic transitivity states that for any arbitrary (1971) presented participants with pairs of celebrities and asked them to select the triplet of options, A, B, and C: celebrity with whom they would prefer to If p(AI{A,B}) 0 .50 and p(BI{B,C}) > .50, have a one-hour conversation. Violations of strong stochastic transitivity appeared in 46 then p(A|{A,C}) 0 max{p(AI{A,B}), percent of the possible triplets. p(Bi{B,C})}. Mellers and Karen Biagini (1994) The first condition, p(AI{A,B}) 0 .50, reviewed the literature on strong stochastic implies that u(A) 0 u(B). The second con- transitivity violations and argued that the dition, p(BI{B,C})>.50, implies that presence or absence of such violations u(B) 0 u(C). By the transitivity of real depends on the comparability of options numbers, u(A) 0 u(C), the first conse- (Krantz 1967; Tversky and Russo 1969). quence, p(Al{A,C}) 0 p(Al{A,B}1), follows When people compare multiattribute from u(B) 0 u(C) and the assumption that options, the similarity of values along one F is a decreasing function of the second attribute enhances differences on other argument. The second consequence, attributes. These comparability effects are p(AI{A,C})> p(BJ{B,C}), follows from easily linked to cognitive processing. When Rieskamp, Busemeyer, and Mellers: Extending the Bounds of Rationality 637 evaluating options, people pay more atten- Consider a choice between two gambles, tion to attributes that differ. Consider, for where gamble A has some probability, pA, of example, two jobs with comparable salaries, winning xA, and gamble B has some probabili- but different commuting times. Due to the ty, PB, of winning xB. The probability of select- similarity of salaries, people place greater ing gamble A over B is the difference between and importance on differences in the products of utilities and subjective commuting times. probabilities as follows: Comparability effects imply that viola- tions of strong stochastic transitivity should (5) p(AI{A,B}) = follow a specific predictable pattern. Mellers J[U(XA)aS(pA),l-U(XB)aS(pB)'], and Biagini (1994) reanalyzed several data sets and found strong experimental evi- where ] is a response function (e.g., a dence. To illustrate the effect, consider the cumulative logistic function), and a and P three gambles shown in table 1. Gamble A are weights associated with utilities and offers a 29 percent chance of winning $3, subjective probabilities, respectively. The otherwise $0, gamble B has a 5 percent weight associated with the utilities, a, chance of winning $17.50, otherwise $0, and depends on the difference between the gamble C has a 9 percent chance of winning subjective probabilities (i.e., S(pA) - s(pB)). $9.70, otherwise $0. Gambles are matched The bigger the probability difference, the in expected values. Furthermore, gambles B smaller the value of a. The same logic holds and C have relatively similar levels of proba- for P; the bigger the difference between bility (i.e., 5 percent and 9 percent). the utilities, the smaller the weight Mellers, Chang, Birnbaum, and Ordonez attached to probabilities. Similar values on (1992) found that A was preferred to B, one attribute enhance differences on other p(A|{A,B}) = .59, and B was preferred to C, attributes. p(B|{B,C}) = .72. Strong stochastic transitivi- The stochastic difference model (Claudia ty implies that the preference for A over C Gonzalez-Vallejo 2002) is another psycholog- should be at least as high as .72. Instead, ical model that can explain violations of p(AI{A,C}), the preference for A over C was strong stochastic transitivity by assuming that only .51. This finding was observed with and options are compared intradimensionally. without financial incentives (i.e., paying par- For each dimension, the model determines ticipants the gambles' outcomes). One could the proportional advantage or disadvantage argue that either the preference for B over of an option compared with the other option. C was "too strong" or the preference for A For example, if gamble A's outcome is larger over C was "too weak." Mellers and Biagini than gamble B's outcome (i.e., XA > XB), the argued that the similarity of probability lev- proportional advantage of gamble A is com- els for gambles B and C (5 percent and 9 puted as (xA - XB)/XA. If the winning probabil- percent, respectively) increased the per- ity of gamble A is smaller than the winning ceived differences in payoffs ($17.50 and probability of B, this represents a propor- $9.70, respectively) making B seem much tional disadvantage defined as (PB - PA)/PB. better than C.4 The disadvantage is subtracted from the Mellers and Biagini (1994) proposed a advantage providing a proportional differ- contrast-weighting model to explain this pat- ence d. The decisionmaker prefers the first tern of strong stochastic transitivity violations. option, when d is larger than (6+ e), where 6 is a personal decision threshold parameter and E is a normal distributed error with mean 4 In their paper, Mellers et al. (1992) presented strength of preference judgments, which were converted zero and a standard deviation, o-, as a free to choice proportions for this review. parameter. 638 Journal of Economic Literature, Vol. XLIV (September 2006)

TABLE 1 GAMBLES USED BY MELLERS ET AL. (1992) TO ILLUSTRATE COMPARABILITY EFFECTS

A B C

Probability of Winning .29 .05 .09 Amounts to Win $3.00 $17.50 $9.70

4. Independence from Irrelevant Alternatives (6) p(Aj{A,B}) = p(AIS) Another principle implied by fixed utility where S is any possible subset of T including theories, is the independence from irrelevant A and B. alternatives (IIA). This principle states that This stronger version does not only the relative preference for two options, such require that the order of the choice proba- as A and B, should not vary when each is bilities for A and B be independent, but also evaluated with respect to option C or D (see that the ratios of the choice probabilities be and Marschak 1954; Tversky independent of the option sets. and Russo 1969; Tversky 1972a).5 have commonly focused More formally, this principle states that upon testing the weaker and more general for any arbitrary quadruple of options, A, B, version of the IIA principles, which of course C, or D, the following property must hold: implies violations of the stronger independ- ence principle. Violations of the weak version If p(AI{A,C}) 0 p(BI{B,C}), are more remarkable since they represent then p(AI{A,D}) 0 p(B|{B,D}). changes in the order of preferences, whereas The condition on the left implies that violations of the strong version may be u(A) 0 u(B), and that this order implies the caused by only slight changes in the strength consequence on the right. There is an equiv- of preferences. Therefore the focus of our alence relation between strong stochastic review is on the weak version. Many studies transitivity and the IIA. Tversky and Russo show that people violate the weaker form of (1969) proved that one principle is satisfied the independence principle. Busemeyer and if, and only if, the other is satisfied. Tversky James T. Townsend (1993) reviewed this lit- (1972a) also proved that the IIA is sufficient erature and argued that these violations were for guaranteeing the construction of a fixed driven by the same comparability effects that utility theory. lead to violations of strong stochastic transi- There is a stronger version of the IIA tivity. For example, Busemeyer (1985) inves- principle economists often refer to: tigated the independence axiom in According to Luce (1959), this principle sequential choices between a gamble and a states that when having an option set T con- "sure thing." Participants, who were paid taining the options A and B the following according to the gambles' outcomes, had to property must hold: choose between gambles: There was a low- risk gamble, A, (normally distributed with 5 Also called "property alpha" by Sen (1993); for an mean zero and standard deviation 5() and a overview of the different notions, see lain McLean (1995). high-risk gamble, B, (normally distributed Rieskamp, Busemeyer, and Mellers: Extending the Bounds of Rationality 639 with mean zero and standard deviation 500). This preference pattern violates fixed utility There were also two sure things: C was-10 theories for the following reason: The initial and D was 10. Participants preferred A to C, binary choice between Beethoven and and B to C. Furthermore, p(AI{A,C}) was Debussy implies that F[u(B), u(D)] > .50 and, greater than p(BI{B,C}) (i.e., p(AI{A,C}) = .75, thus, u(B) >u(D). The advantage of and p(Bi{B,C}) = .56). According to the inde- Beethoven over Debussy should continue in pendence principle, p(AI{A,D}) should be the presence of a second Beethoven record- greater than p(BI{B,D}). However, the oppo- ing. That is, with three options, site order appeared (i.e., p(AI{A,D}) =.30, p(BI{B,D,B'}) = F[u(B),u(D),u(B')] > F[u(D), and p(BI{B,D}) = .50). u(B),u(B')] =p(DI{B,D,B'}) because F is an Busemeyer and Townsend (1993) argued increasing function of the first argument and that choice probability between two gam- a decreasing function of the remaining bles, A and B, is an increasing function of the arguments. But that is not what happens. ratio d = (u(A) - u(B))/o-, where the numer- Tversky (1972b) called this violation of the ator is the expectation of the differences independence principle the similarity effect. between the random utilities of the gambles, Debreu's example has been frequently and the denominator is the standard devia- referred to and adapted to many applications tion of the differences between the random in economics. For instance, the "red-bus utilities of the gambles. The standard devia- blue-bus problem" represents a famous tion, oa, determines how easy or difficult it is adaptation to research on transportation and to discriminate the mean difference. Choices travel behavior (e.g., Train 2003). Here a between the low-risk gamble and the sure traveler has the choice between a car and a thing are easy to discriminate because the blue bus for traveling to work. Now a second standard deviation, o0, is small; whereas bus, which only differs from the blue bus in choices between the high-risk gamble and its red color, is added to the option set. the sure thing are difficult to discriminate When now choosing among the three because the standard deviation, oa, is large. options, decisionmakers might split their Violations of the independence principle preferences for buses between the red bus have also been found when choice sets con- and the blue bus, violating the IIA principle. tain more than two options. A famous exam- The most common random utility theory, ple originally described by Gerard Debreu the multinomial logit model described (1960) illustrates this case. Envisage a music above, can not account for these violations lover who makes choices between classic because it assumes that the errors are statis- recordings. When presented with a choice tically independent. The joint distribution of between a recording of Beethoven's Eighth utilities for the set of options Y is assumed to Symphony (option B) and a recording of the be identically and statistically independent Debussy quartet (option D), the music enthu- for all choice sets. Since these independence siast has a higher probability of choosing violationsB. seem intuitively reasonable, the Then a different recording of Beethoven's insufficiency of the multinomial logit model Eighth Symphony by the same orchestra, but to explain them has initiated an extensive with a different conductor is added to the development of new versions of random util- choice set (option B'). When evaluating all ity theories. These new versions of random three options, the music enthusiast has the utility theories, predominantly proposed by highest probability of choosing the recording economists, allow correlations in the joint of the Debussy quartet. The probability of distributions of utilities (for an overview, see choosing a recording of the Eighth Symphony Robert J. Meyer and Barbara E. Kahn 1991; is now divided between the two recordings of McFadden 2001). These theories have been the Eighth Symphony, and thus is smaller. successfully applied to describe choices in a 640 Journal of Economic Literature, Vol. XLIV (September 2006) variety of domains, including transportation alternatives (David A. Hensher 1986; (7) Uq(Ai)= Lk + Eiq5 McFadden 1974), occupations (Peter where xikq are explanatory variables, Pkq are Schmidt and Robert P. Strauss 1975), or the weights for each variable that is automobiles (Steven T. Berry 1994; Berry, assumed to be a random variable following a James Levinsohn, and Ariel Pakes 1999). particular form, such as a normal distribu- Three important classes of random utility tion with a particular correlation matrix, 1, models can be distinguished: generalized and eiq are error components independently extreme value models, multinomial probit or and identically extreme value distributed. Thurstone models, and mixed logit models On the one hand, the weights are allowed to (for an overview, see William H. Greene vary across individuals to capture the het- 2000 or Train 2003). erogeneity of individuals' preferences; on The most prominent model of the general- the other hand, the weights are assumed to ized extreme value models (McFadden 1978) be stable within an individual to permit cor- is the nested logit model, which assumes that relations between utilities. These correla- the choice set can be partitioned into subsets, tions can capture similarity effects, and called nests. The error components in equa- thereby violations of IIA. tion 2 may be correlated within each subset To illustrate how the mixed logit model can but are uncorrelated across nests. With this predict similarity effects, consider the follow- assumption, the IIA principle holds for all ing problem. Suppose two different types of options within a nest but not for options across cars share the same share of the automobile nests. These models can explain particular market. The first car, A, is rather expensive, violations of IIA. but very luxurious, whereas the second car, The second class of random utility models, B, is cheap, but offers only the standard multinomial Thurstone or probit models, equipment. Mixed logit models can explain including Richard D. Bock's and Lyle V. the equal share by assuming that the weights Jones's (1968) Thurstone model, and Jerry A. given to price and comfort vary across con- Hausman's and David Wise's (1978) condi- sumers. Half of the consumers might give a tional probit model, can also explain viola- relatively high weight to the price, whereas tions of the IIA. For these models, the error the other half gives a relative low weight to components in equation 2 are assumed to be the price, so that different utilities for the two drawn from correlated normal distributions. cars, but similar market shares, result. Now Dependencies between these distributions suppose a new car, C, enters the market. It is capture correlations between the options. cheap and similar to car B. Consumers who These models are less restrictive than the place a relatively high weight on price will nested logit models and allow the prediction find this car appealing, whereas the other of "any pattern of substitutions among consumers who preferred car A will not be options" (Meyer and Kahn 1991, p. 98). attracted to the new car. Therefore, the mar- The third and most recent class of random ket share of the cheap car B will decrease due utility models is the mixed logit models, also to the market entry of the second cheap car called random coefficient models (Berry C, whereas the market share of the expensive 1994; Berry, Levinsohn, and Pakes 1995, car A stays stable, which represents a IIA vio- 1999; Hensher and Greene 2003; David lation. Thus, heterogeneity across individuals Revelt and Train 1998). Mixed logit models can produce correlations among the cars assume that the weights, /, of the explanato- utilities. Moreover, the distributions from ry variables k are random parameters that which the weights are drawn can be correlat- vary across individuals q. According to these ed, so a variety of similarity effects can be models, the utility of an option is: represented. Rieskamp, Busemeyer, and Mellers: Extending the Bounds of Rationality 641

To explain specific similarity effects, it can for making predictions about choices if new be helpful to represent the mixed logit options are added to the option set (see model in another form that is mathematical- Meyer and Kahn 1991). In general, the ly equivalent to equation 7, but decomposes recent random utility theories have been the random weights fiqk as follows: criticized as being "black box" models that do not specify the psychological process of (8) Uq(Ai) = kk Xikq+ +Xkkq" Z'ikq +,iq' choice (Eliahu Stern and Harry W. where ak are fixed weights, ykq are random Richardson 2005). weights with a mean of zero, and Xikq and Zikq Psychologists have addressed similarity are explanatory variables. Since the variables effects, such as the violations of the IIA, by Zikq are multiplied by weights with a mean of developing descriptive theories. Tversky zero, the sum of the explanatory variables (1972b) proposed a probabilistic choice with the random weights, Ykq, and the error theory called elimination by aspects (EBA). components, Eiq, represent the error compo- The theory makes the crucial assumption of nent of the utility of option i. Notice that, attention switching, which means that unlike the error components, e,, the ran- the decisionmakers' attention randomly dom weights, y'kq, can vary across individuals switches between the attributes of the but not across options. Depending on the available options (i.e., their aspects). For values of Zikq, options' utilities can be corre- instance, in the "red-bus blue-bus-prob- lated. For instance, if an explanatory vari- lem" the decisionmaker might consider able, Zikq, is a dummy variable that reflects convenience and environment friendliness membership in a nest, the mixed logit model as two relevant attributes, with the car as can approximate the predictions of nested having the aspect of being convenient and logit models. McFadden and Train (2000) the busses as having the aspect of being have shown that, under quite general environment friendly. When the decision- assumptions, any random utility model can maker's attention first focuses on the con- be approximated by a mixed logit model. venience, the two buses are eliminated The mixed logit models and the other from further consideration and the car is newer versions of the random utility models chosen. When, however, first giving consid- are relatively complex when compared to eration to the environment friendliness the multinomial logit model, so that compu- leads to the car being eliminated, after- tational problems limited their popularity. wards one of the two buses is chosen. In But now, since it is easier to do parameter this process the blue bus splits its share estimations, these models are making a with the red bus, so that the chances of comeback. However, the lack of parsimony selecting one of the two buses are smaller makes it difficult to interpret the parame- than the chance of choosing the car. ters, and the models are rather ad hoc (e.g., More generally, let T be a three-alterna- membership in different nests for the nested tive set, T = {A,B,C}. Each option has aspects logit models), and therefore, do not shed associated with it, some shared and some much light on why the similarity effects unique. Figure 2 illustrates a special case. occur. If subsets of options for different Let K be the sum of the utilities of all of the choices can not be defined a priori, one aspects, unique aspects u(a), u(b) and u(c), should not generalize the predictions of the and shared aspects u(ab), u(ac) and u(bc). model to new choices. Likewise, when fit- The probability of selecting A from the set, ting a mixed logit model to data, one can T, is: describe different types of similarity effects, but it is often difficult to predict the effects (9) p(AIT) = [u(a) +u(ab) . p(AI{A,B}) + a priori. Thus, these models are less suitable u(ac) . p(AI{A,C})]/K. 642 Journal of Economic Literature, Vol. XLIV (September 2006)

Figure 2. An Alternative Set Where Alternatives Share Aspects

Equation 9 shows the three ways that A that the weights are fixed for each individual could be chosen. First, a could be selected but vary across individuals. from the set of aspects with probability The compromise effect (Itamar Simonson u(a)/K. If so, A would be chosen with prob- 1989; Tversky and Simonson 1993) is another, ability 1. Second, ab could be selected with more recently observed violation of inde- probability u(ab)/K, and then A would be pendence with multiple alternatives. Let T be chosen over B with probability a three-option set, T = {B,C,D} each of which p(AI{A,B}) = (u(a) +u(ac))/(u(a) +u(ac) + is described by two attributes. Option B has u(b) + u(bc)). the most advantageous attribute value for one attribute, whereas option D has the most Finally, ac could be selected with probabili- advantageous attribute value for the other ty u(ac)/K, and A would be chosen over C attribute. Option C has attribute values that with probability lie between the attribute values of option B p(A I{A,C}) = (u(a) +u(ab))/(u(a) +u(ab) + and D along both attributes. When presented u(c) + u(bc)). with only two options, B and C, option B is The attention switching assumption of the more popular so that p(BI{B,C}) 0 elimination by aspects theory resembles the p(C|{B,C}); however when presented with all mechanism the mixed logit model is using to three options, the relation between B and C explain IIA violations since both theories reverses so that p(CI{B,C,D}) > p(BI{B,C,D}), assume that the weights of attributes are not thus violating IIA. fixed. The distinction is that the elimination Simonson demonstrated the effect in a by aspect theory assumes that weights vary study in which participants made choices due to attention fluctuations of each individ- among cameras. One camera was high in qual- ual, whereas the mixed logit model assumes ity and price, and the other was low in quality Rieskamp, Busemeyer, and Mellers: Extending the Bounds of Rationality 643 and price. The third option, called the com- so that vk(A) -vk(B) is negative, then promise, had moderate values on both disk(A,B) = 8[vk(B) -vk(A)], or otherwise dimensions. When individuals were present- zero. The theory makes the crucial assump- ed with binary choices, participants were tion that 3 is a convex function incorporating roughly indifferent between the two options. . Due to the loss aversion However, when given a choice among all assumption, the disadvantage of A over B is three cameras, individuals chose the com- larger than the corresponding advantage of promise camera more often than either of B over A. In the compromise example, the the two extremes. This finding violates the compromise alternative C has small advan- independence principle. tages and disadvantages relative to B and D, To explain the compromise effect, whereas both B and D have relatively large Tversky and Simonson (1993) developed advantages and disadvantages with respect the componential context theory, which to each other. Due to the convex loss aver- requires a dimensional representation of the sion function, the large disadvantages of B options in a multidimensional space. In this and D make them less appealing, whereas theory, Tversky and Simonson distinguished due to C's small disadvantages, it becomes between the background context defined by the most attractive alternative. prior choice sets and the local context Interestingly, the elimination by aspects defined by the immediate choice set. If the theory (Tversky 1972b) fails to predict com- background context is held constant, then promise effects (see Appendix A for a proof), an option, A, is selected from a set, T, if it and the componential context theory fails to maximizes: explain similarity effects (see Appendix B for a proof). Neither theory provides a general (10) V(AIT) = and coherent account of independence vio- IkPkvk(A) + 01BTR(AB) lations. Nevertheless, the theories demon- strate that fixed utility theories in which where the overall value of choosing options A from are evaluated independently of other set, T, depends on the global context optionsand the are not suitable for explaining the immediate context. The first term on the competitive effects of options we reviewed. right side of the equation reflects the global Eldar B. Shafir, Daniel N. Osherson, and context and is the sum of the products of Edward E. Smith (1993) have argued that, to each value vk associated with A along each explain these competitive effects, a compara- dimension k times the relative contributions tive approach of decision theories is of that dimension. The second term that required. Theories following this compara- reflects the immediate context is 6, the rela- tive approach, like the elimination by aspects tive contribution of the immediate context theory or the componential context theory, times the sum of the advantages and disad- incorporate comparison processes among vantages of A relative to other options in the options in the evaluation process. Shafir, choice set. The relative advantage of A over Osherson, and Smith (1993) proposed an B on dimension k is defined as: advantage model, which also takes into (11) R(A,B) = account comparisons among options. However, since the theory is restricted to pairwise choices and is defined as a deter- ministic theory, it cannot handle the compet- Xkadvk(A,B) itive effects of option sets with more than two for which advk(A,B) = vk(A) - Vk(B) if vk(A) - options or the probabilistic nature of choice. vk(B) is positive or otherwise zero. If option In sum, two more bounds of rationality A has a disadvantage over B on dimension k have now been crossed. There are numerous 644 Journal of Economic Literature, Vol. XLIV (September 2006) studies and robust examples of how and to the attribute levels of the new option that when people violate strong stochastic transi- is added to the set. tivity and the closely related principle of IIA. Suppose people are indifferent between These consistency principles must be relaxed two options, A and B. The "decoy" option, D, to account for human choice behavior. is then added to the choice set. D is said to be "asymmetrically dominated" if it has val- ues that are worse than those of another 5. Regularity option, such as B, along one attribute and its values are no better than those of B on all Random utility theories obey a property of choice known as regularity. This principle other attributes. Asymmetric dominance asserts that the addition of an option to a effects occur if the probability of choosing B increases when D is added to the choice set. choice set should never increase the proba- bility of selecting an option from the original Attraction effects are similar to asymmetric set. More formally, assume that X is a com- dominance effects, but D is not asymmetri- plete set of options and Y is a subset of cally dominated by B. Instead, D simply X(YCX), that is Y= {A,...A,,} and seems a lot worse. These effects are illustrated with the X = {A,...,A,} with n 0 m. If Y is presented for choice, and A, is a member of both Y and options in table 2. Suppose a decisionmaker X, then the probability of choosing Ai from is indifferent between two laptop computers, the set Y is always equal to or greater than A and B. A is relatively heavy, but has a large the probability of selecting A, from the set X,display. B is lighter, with a smaller display. Now the decisionmaker considers a third that is, p(Ai IY) - p(A IX). This property holds for random utility theories since it is less like- computer, D that is heavier than B, but has ly that Ai has the highest utility in a larger set,the same sized display. The mere presence of X, with n options than in a smaller set, Y, with D makes B seem better. The probability of m options: selecting B from the set of three computers (A, B, and D) is greater than the probability p(AiIY) = p(U(Ai) = max{U(Al),...,U(Am)}) of selecting B from the set of two computers > p(U(Ai) = max{U(A1),...,U(A,,)}) x (A and B). This increase in probability is the p(U(Ai) = max{U(A),...,U(A)}U(Ai) = asymmetric dominance effect. Now suppose that the decoy, D, is not asymmetrically max{ U(A1),..., U(Am,,) = p(A,IX). dominated by B, yet it still seems inferior to That random utility theories obey the regu- B (e.g., the display for D is 13.2 inches, larity principle also holds for newer versions rather than 13 inches). Here, the increase in of random utility theories, including the the probability of B with the inclusion of D is mixed logit model (see Appendix C for called a the attraction effect. In both cases, the proof). result is a violation of regularity. Do people obey the principle of regulari- Asymmetrical dominance effects have ty? Many studies show robust violations of been found in choices among gambles this principle. The addition of a new option (Douglas H. Wedell 1991), consumer prod- can increase the choice probability for an ucts and services (Wedell and Jonathan C. option in the original set. Such violations are Pettibone 1996), job applicants (Scott called either asymmetrical dominance Highhouse 1996), and political candidates in effects (Joel Huber, Payne, and Christopher U.S. elections (Yigang Pan, Sue O'Curry, and Puto 1982) or attraction effects (Huber and Robert Pitts 1995). The effects occur with Puto 1983; for an overview, see Timothy B. choices and other measures of preference Heath and Subimal Chatterjee 1995). The ( and Thomas S. Wallsten 1995; difference between these two effects refers Sanjay Mishra, U.N. Umesh, and Donald E. Rieskamp, Busemeyer, and Mellers: Extending the Bounds of Rationality 645

TABLE 2 ILLUSTRATION OF ASYMMETRIC DOMINANCE EFFECTS WITH CHOICES AMONG LAPTOPS

A B D

Weight 6 lbs 3 lbs 3.5 lbs Display Size 15 ins 13 ins 13 ins

Stem 1993; Pan and Donald R. The Lehmann theory postulates that the decision- 1993; Tversky and Simonson maker1993). forms Finally, preference states for options the effects are highly robust in that both are the within- result of previous preference subject designs (repeated choices states plus theby currently the retrieved utilities same individuals) and between-subject for options minus negative influences of designs (different choices across alternative groups options of in the choice set. If a individuals). preference state for an option crosses a Random utility theories cannot account threshold, the decisionmaker selects that for violations of regularity because the joint option. During the dynamic process of com- distribution of utilities for the option set, Y, paring the alternatives, attention switches is assumed to remain the same when options from one attribute to another; thus, the the- are added to create a larger set, X. However, ory also includes an attention switching the empirical evidence suggests that the assumption as was proposed by the elimina- joint distribution of utilities for the larger set tion by aspect theory (Tversky 1972b), which changes depending on the options that are allows the explanation of similarity effects. included in the set. A psychological theory In addition, the theory assumes that options that specifies the cognitive process of choice compete with each other (in the case where and can explain dependencies among ci > 0) and similar options compete more options has been proposed by Busemeyer heavily than dissimilar options by exerting and Townsend (1993) and Robert M. Roe, stronger negative influences. These compet- Busemeyer, and Townsend (2001). It is itive effects can predict violations of regular- called decision field theory. According to this ity (see Roe, Busemeyer, and Townsend theory, choices are the result of a dynamic 2001; Busemeyer and Adele Diederich process during which the decisionmaker 2002). The theory also predicts that these retrieves, compares, and integrates random violations of regularity should increase when utilities over time. During the deliberation decisionmakers deliberate longer, a predic- period, a random utility is retrieved for each tion that has been supported in several option, Ai, at each moment in time t, denot- experiments (Simonson 1989; Wedell 1991; ed Ut(Ai). These utilities are integrated Ravi Dhar, Stephen M. Nowlis, and Steven across time by a linear dynamic process to J. Sherman 2000). form a preference state: Another psychological theory that can (14) Pt(A) = s. Pt_(A) + Ut(Ai) - explain dependencies between options is a nonlinear adaptive network model called the _IJ(Aj),#j leaky, competing accumulator model (Marius where s is a weight that reflects the memory Usher and James L. McClelland 2004). This of recent preferences. The coefficients, theory is very similar to decision field theory ci= cji, are a function of the distances in the sense that it also assumes that prefer- between the options in a multidimensional ences develop in a dynamic manner over attribute space. time. Likewise, it includes the attention 646 Journal of Economic Literature, Vol. XLIV (September 2006)

TABLE 3 GAMBLES USED BY TVERSKY (1969) To DEMONSTRATE INTRANSITIVE PREFERENCES

A B C D E

Probability of Winning 7/24 8/24 9/24 10/24 11/24 Payoff $5.00 $4.75 $4.50 $4.25 $4.00

switching assumption to explain similarity weak stochastic transitivity (see Appendix E effects. In contrast to decision field theory, it for a proof). assumes loss aversion to explain regularity It is interesting to look closely at a classic violations. Thus, it combines the attention study by Tversky (1969) to understand the switching assumption of the elimination by conditions under which violations of weak aspect theory (Tversky 1972b) with the loss stochastic transitivity can occur. Tversky aversion assumption of the componential asked participants to choose between pairs of context theory (Tversky and Simonson 1993) gambles, as shown in table 3. Gambles varied to explain IIA and regularity violations. in probabilities and payoffs, but expected values were similar. When presented with 6. Weak Stochastic Transitivity gambles having similar probabilities, some participants frequently preferred the gamble Weak stochastic transitivity is perhaps the with the larger payoff. That is, they preferred weakest bound on rationality of the five prin- A to B, B to C, C to D, and D to E. But when ciples that we consider. This principle states presented with gambles that substantially that for any arbitrary triple of options, A, B, and C: differed in probabilities, they preferred the gamble with the greater probability of win- If p(AI{A,B}) > .50 and p(BI{B,C}) 0 .50, ning. That is, they preferred E to A. Tversky's then p(AI{A,C}) 0 .50. results were later replicated by Harold R. Weak stochastic transitivity lies at the heart Lindman and James Lyons (1978) and of weak probabilistic choice theories. extended by Jonathan W Leland (1994). Those theories based on standard utility Tversky (1969) proposed an additive dif assumptions must satisfy the principle ference model based on the concept of a just because u(A) 0 u(B) 0 u(C) -4 p(A|{A,C}) noticeable difference (Luce 1956, p. 180) to > .50. Furthermore, many strong probabilis- account for the violations. A just noticeable tic choice theories must also satisfy weak difference is the amount of change required stochastic transitivity, including all simple in a physical before people detect a scalable or fixed utility theories, elimination difference between similar stimuli on some by aspects theory, and decision field theory. percentage of occasions. Tversky argued that The contrast-weighting model and the sto- differences in probabilities for the adjacent chastic-difference model can, however, gambles (i.e., A and B, or B and C) were less account for violations of weak stochastic than a just noticeable difference, or not suffi- transitivity. Among the random utility mod- ciently great for people to differentiate els considered here, only the mixed logit between gambles. The additive difference model allows violations of weak stochastic model is a noncompensatory, lexicographic transitivity (for an example, see Appendix semi-order A model is noncompensatory if D). In contrast, the other random utility the information from one dimension cannot models, like, for instance, the generalized be compensated by any combination of infor- extreme value model, obey the principle of mation from other less important dimensions Rieskamp, Busemeyer, and Mellers: Extending the Bounds of Rationality 647

(see also Rieskamp and Ulrich Hoffrage especially when dimensions differ in meas- 1999). In this account, decisionmakers select urement units (Tversky 1969). Finally, the gamble with the highest value on the intradimensional comparisons simplify the most important dimension, but if the differ- detection of dominance relationships. ence between gambles on that dimension is Another class of theories that allow intran- too small to be detected, decisionmakers sitive choices is regret theories (David E. Bell consider the next most important dimension 1982; Peter C. Fishburn 1982; Loomes and and choose the gamble with the highest value Sugden 1982). These theories assume that on that dimension, and so on. the decisionmaker prefers the option with (1988) and Leland (1994) the highest utility but the utility of an option offer a similar account to describe choices depends on an independent evaluation of the between gambles. They argue that people option and a comparative evaluation with the follow a sequential decision process in which other option as follows: they first check whether one gamble domi- (15) u(Ai) = Ep, [u(x) + nates the other. Then they compare the probabilities across gambles and outcomes w .- R(u(xs),U(Xsj)), across gambles. If levels of either attribute where Ai is the option under consideration, are perceived as being similar on one dimen- Aj is the alternative option, s refers to states sion and dissimilar on the other dimension, with consequences x, and where w = 1 if the dissimilar dimension becomes decisive. u(xsi) > u(xsj), and w -=-1 otherwise. The Only after these steps are completed nonlineardo function R(u(xsi), u(xj)) is increas- decisionmakers move to an unspecified third ing in its first argument and decreasing in its step of selecting a gamble. This account, alsosecond argument; a possible function could based on perceived similarity, predicts thebe R[u(xs), u(xsj)] = [u(xs) --U(Xsj)]. intransitive choices described above. The To illustrate, consider three pairwise contrast-weighting model offered by Mellers choices between the gambles A, B, and C, and Biagini (1994) is a more precise version with outcomes determined by throwing a of this idea in which the effects of similarity die. Gamble A has the smallest payoff of $2 and dissimilarity are reflected in the weights with certainty, gamble B has a medium pay- associated with the contrasts. off of $3 if the die throw results in a one, The idea of a noncompensatory decision two, three, or four, and option C has the process that focuses on one dimension at a highest payoff of $5 if the die throw results time combined with a just noticeable differ- in a five or six. In a choice between A and B, ence is a reasonable explanation for such vio- A has the higher utility because the antici- lations, especially when options differ along pated regret associated with B if the die many dimensions and the differences vary comes up five or six is disproportionately substantially. It is efficient to place greater large relative to the regret associated with A attention on dimensions with larger differ- if a one, two, three, or four were thrown. ences and ignore or discount dimensions with Furthermore, the decisionmaker will prefer smaller differences (Mellers and Biagini B to C using the same logic. However, in a 1994). Furthermore, theories that permit vio- choice between A and C, the decisionmaker lations of weak stochastic transitivity (Tversky will prefer C, because the anticipated regret 1969; Mellers and Biagini 1994; Gonzalez- with A if a one or two is thrown is extremely Vallejo 2002) assume that decisionmakers high. As this examples shows, regret theories compare options intradimensional-wise, that can explain intransitive choices, but they is, focusing on each dimension sequentially. cannot explain compromise or attraction Intradimensional comparisons are usually effects, nor can they handle the probabilistic easier than interdimensional comparisons, nature of choice. 648 Journal of Economic Literature, Vol. XLIV (September 2006)

How common are violations of weak sto- choice behavior in many, perhaps even most, chastic transitivity? John M. Davis (1958) situations. found some violations but decided they were nonsystematic. Donald Davidson and 7. Discussion Marschak (1959) studied preferences for bets and found that 7 percent to 14 percent The goal of the present review is to exam- of choice triples violated transitivity. ine how, when, and why people violate con- Kenneth R. MacCrimmon (1968) examined sistency principles that embody the the choices of thirty-eight business man- traditional view of rationality. Five con- agers and found that eight exhibited some straints on preferential choice were intransitivities, although the violations didaddressed. Those constraints, or consistency not appear to be reliable. William T.principles, are partially ordered on a contin- Harbaugh, Kate Krause, and Timothy R.uum from strictest to weakest. Figure 1 Berry (2001) examined transitivity in chil- shows that the principles include perfect dren and found relatively few violations, consistency, strong stochastic transitivity, although many inconsistencies. The intransi- IIA, regularity, and, finally, weak stochastic tivities found by Tversky (1969) were based transitivity. The evidence for and against on a preselected group of participants for each principle was examined. The empirical whom inconsistent behavior had been effects, their connections to principles, and observed before. In a recent summary briefof the definitions of the effects are presented literature, Luce (2000) pointed out in tablethat 4. The frequency and breadth of the intransitivities may occur, but it may violationsbe due declined as we proceeded from to the simple fact that choices are not stricteralways to weaker bounds. There was strong consistent (see also Jean Claude Falmagne evidence against perfect consistency, strong and Geoffrey Iverson 1979). stochastic transitivity, independence, and Other researchers have found systematic regularity. This evidence is important violations of weak stochastic transitivity because it implies that descriptive theories (Henry Montgomery 1977; Robert must relaxH. these constraints in order to make Ranyard 1977; David V. Budescu and accurateWendy predictions. Weiss 1987; Birnbaum, Jamie N. Patton, and 7.1 Comparison of Models Melissa K. Lott 1999; Lindman and Lyons 1978; Peter Roelofsma and Daniel Read Deterministic theories of choice, such as 2000). However, these violations-in stark the subjective expected utility theory contrast to violations of strong stochastic (Leonard J. Savage 1954) or multiattribute transitivity-are relatively rare. That is, utility theory (Keeney and Raiffa 1976), can- there are reliable circumstances where they not accommodate the fact that choices are occur, but those circumstances are fairly probabilistic. The "conventional " unusual (see Becker, Degroot, and (Starmer 2000) for handling this problem is Marschak 1963; Krantz 1967; Rumelhart to claim that inconsistencies are random. and Greeno 1971; Sjoberg 1975, 1977; Though tempting, this strategy fails to Sjoberg and Capozza 1975; Mellers, Chang, explain how and when choice probabilities Birnbaum, and Ordonez 1992). For these will vary across sets. To accomplish the lat- reasons, it can be argued that the principle ter, two classes of probabilistic choice theo- of weak stochastic transitivity should gener- ries were proposed: fixed utility theories ally be retained as a bound of rationality. (e.g., Debreu 1958; Luce 1958) and random Descriptive theories that do not allow viola- utility theories (e.g., Thurstone 1959; tions of weak stochastic transitivity usually Becker, Degroot, and Marschak 1963; perform reasonably well when predicting McFadden 2000). Rieskamp, Busemeyer, and Mellers: Extending the Bounds of Rationality 649

TABLE 4 SUMMARY OF EFFECTS

Principles Emperical Evidence Brief Description of Effects

Strong Stochastic Comparability Effects p(AI{A,B)) > .5, p(BI{B,C}) > .5, and p(AI{A,C}) > .5, Transitivity but p(BI{B,C}) > p(Aj{A,C}) when B and C have similar values on one dimension

Independence Similarity Effects p(BII{Bj,D}) > p(D {B1,D}), from Irrelevant but p(BI|{Bj,D,B2}) p(CI{B,C}) but p(CI{A,B,C}) > p(B|{A,B,C}) when values of C are between values of A and B on all dimensions

Regularity Asymmetric Dominance p(AI{A,B,DI) > p(A {A,B}) when D is dominated by A, but not by B Effects

Attraction Effects p(AI{A,B,D)) > p(AI{A,B}) when D is much inferior to A, but not to B

Weak Stochastic Transitivity Violations p(AI{A,B}) > .5, p(Bj{B,C}) > .5, but p(CI(A,C)) > .5 Transitivity

Fixed utility theories that assume a well- explained by the contrast-weighting model defined preference function imply strong (Mellers and Biagini 1994) and the stochastic stochastic transitivity and the IIA. Again, difference model (Gonzalez-Vallejo 2002). there is a considerable body of empirical evi- But since these models have only been dence, including similarity effects, compara- defined for binary choices, they cannot predict bility effects, and the compromise effect, asymmetric dominance effects, attraction showing violations of these principles. effects, compromise effects, or similarity Random utility theories imply a consistency effects. The most general theories considered, principle of regularity. There is a substantial decision field theory (Busemeyer and body of evidence, including attraction Townsend 1993; Roe, Busemeyer, and effects and asymmetric dominance effects, Townsend 2001) and the leaky, competing against regularity. accumulator model (Usher and McClelland A number of psychological theories have 2004), can explain most effects, except been proposed to accommodate the viola- violations of weak stochastic transitivity. tions. Table 5 summarizes many of the theo- In sum, we have reviewed several theories ries previously mentioned and shows which that can predict, to differing degrees, viola- effects they can and cannot predict. Tversky's tions of the consistency principles. We now (1972b) elimination by aspects theory pro- face the more general question of how to vides an elegant account of similarity effects select a descriptive theory of preferential (violation of independence) but fails to choice. Most theorists agree that two criteria account for asymmetric dominance effects are essential for model selection: pre- and attraction effects (violations of regularity). dictability and parsimony (see Heinz Linhart Tversky and Simonson (1993) developed the and Walter Zucchini 1986; Pitt, Myung, and componential context theory to account for Zhang 2002). The best theory is one that compromise effects but their theory cannot accurately predicts human behavior and is describe similarity effects. Violations of strong simple. Unfortunately, when fitting a model and weak stochastic transitivity can be to the data, accuracy and parsimony usually 650 Journal of Economic Literature, Vol. XLIV (September 2006)

TABLE 5 SUMMARY OF PRINCIPLES, VIOLATIONS, AND THEORETICAL ACCOUNTS

Principles Violations Theoretical Accounts

Consistency Variations in MNL GEV MLM DFT LCA CWM SDM EBA Choices

Strong Comparability GEV MLM DFT LCA CWM SDM EBA ADM Stochastic Effects Transitivity Independence Similarity Effects GEV MLM DFT LCA EBA from Irrelevant Alternatives Compromise GEV MLM DFT LCA CCT Effects

Regularity Asymmetric DFT LCA CCT Dominance Effects

Attraction Effects DFT LCA CCT

Weak Transitivity MLM CWM SDM ADM Stochastic Violations Transitivity where MNL= Multinomial Logit Model, GEV = Generalized Extreme Value Model, MLM = Mixed Logit Model, DFT = Decision Field Theory, LCA = Leaky, Competing Accumulator Model, CWM = Contrast-Weighting Model, SDM = Stochastic Difference Model, EBA = Elimination by Aspects, CCT = Componential Context Theory, and ADM = Additive Difference Model

work in opposition; the more complex different the parameters, whereas decision field model, the better the fit. theory can explain these results using the For choice contexts with equally similar same or set of parameters. In this instance, equally discriminable options, violations decision of field theory provides a simpler and IIA are less common and simple scalablemore parsimonious explanation. Moreover, models should suffice. However, this unlike case isdecision field theory and the leaky, fairly unusual and fails to hold in most competing con- accumulator model, mixed logit sumer choice tasks (e.g., whenever there models are are unable to explain violations of correlated attribute values or unequal the vari- regularity principle. There are many ances in payoffs). If options do vary in consumer simi- choice situations that contain defi- larity or discriminability but all cientof the options (e.g., a product that is extreme- obviously deficient options have ly overpricedbeen because of its brand name), removed, regularity is likely to be satisfiedand violations of regularity would be expect- although IIA could be violated. In this ed. situa- When this occurs, one needs to turn to tion, recent random utility models likemore the complex models that do not require mixed logit model can be effective. regularity.Thus, it In sum, only the more complex can happen that multiple theories canmodels pre- such as the decision field theory, the dict the same violations. For example, leaky both competing accumulator model, or the mixed logit models and decision field mixed theory logit model can explain several of the are capable of explaining similarity behavioreffects. effects that we have reviewed. However, the mixed logit model requiresFuture work will need to focus on direct Rieskamp, Busemeyer, and Mellers: Extending the Bounds of Rationality 651

comparisons of these models to explore how 7.2 When Are Violations of Consistency appropriately the models predict human Principles Reasonable? behavior. Another important principle to consider We have argued that some violations of for model selection is generalizability. The consistency principles are common. Now we goal of model selection is to find a theory address the question of whether these viola- that is capable of predicting not only the tions are reasonable. Rationality is tradition- observed choices in an experiment but also ally defined as compliance with consistency other behaviors that are independent of the principles, and inconsistency implies "irra- particular experiment used for parameter tionality" (Shafir and Robin A. LeBoeuf estimation. When predicting new data, sim- 2002). This approach has been widely chal- pler models are often more robust and lenged (e.g., Aumann 1962; Machina 1982; achieve higher levels of accuracy under new Cohen and Jaffray 1988; Gigerenzer 1991, conditions. This may be particularly impor- 1996a, 1996b; Lopes and Oden 1991; tant for the design of new products. Simpler Hammond 1996; Lopes 1996; McFadden models, built on psychological principles 1999; Mongin 2000). Selten (2001, p. 15) rather than ad hoc parameters, promise to argued that "behavior should not be called be more effective for purposes of generaliza- irrational simply because it fails to conform tion. For example, Tversky's (1972b) elimi- to norms of full rationality" and that "bound- nation by aspects model provides ed rationality is not irrationality." Bounded mechanisms to make a priori predictions rationality is a more accurate description of about the effects of new options. human behavior taking into account that Is there a proper balance between pre- people make decisions with limited time, dictability and complexity? The answer knowledge, and computational power. This depends on the application and the circum- view, initially proposed by Simon (1956, stances. Descriptive theories should be able 1983), recognizes that the lack of compli- to predict violations of consistency princi- ance with consistency principles could, in ples when violations are commonplace. A fact, be functional. For example, the benefit review of the literature suggests that viola- of a quick decision could easily outweigh the tions are frequent for all of the principles cost of some inconsistency. that we examined, except perhaps weak sto- How reasonable are violations of the five chastic transitivity. These violations are consistency principles discussed here? We quite systematic, but relatively rare (cf. will begin with perfect consistency, a princi- Mellers and Biagini 1994). In fact, William ple that follows from deterministic choice H. Morrison (1963) noted that the theo- theories. If individuals had "true" prefer- retical number of intransitive triples in a ences, they might find themselves at a dis- complete set of pairwise comparisons is tinct disadvantage if they deviated from those quite limited: Given n options, the maxi- preferences, especially in the long run. mum number of intransitive triples is However, in dynamic environments, where (n + 1)/(4(n -2)). As n increases, the maxi- individuals are continually faced with new mum number of intransitive triples approach- sets of options, some degree of variability is es one fourth. This result implies that even if beneficial for exploring the environment. A a decision process produces a maximum of competitive environment is another case intransitive choices, the intransitive triples where some degree of variability could be will still be relatively atypical. Therefore, desirable. The decisionmaker whose choices increasing the complexity of a theory to are difficult to forecast has an advantage over account for violations of weak stochastic a predictable opponent. In fact, in zero-sum transitivity appears less justified. games stochastic decisions, representing 652 Journal of Economic Literature, Vol. XLIV (September 2006)

mixed strategies, are the only existing Nash "independent" because they change the con- equilibrium defining optimal behavior. In sequences of the original option (i.e., norms of sum, these real-world advantages of vari- politeness). However, this is exactly why viola- able preferences could easily carry over into tions of independence occur: When people artificial stable laboratory settings. construct their preferences in a choice situa- Numerous studies cast doubt on the idea tion, additional options can change the conse- that people possess "true" preferences in the quences of other options or the perception of sense of fixed utility theories. Instead, the evi- these consequences. dence suggests that people construct or dis- A decision process that constructs or dis- cover their preferences when presented with covers preferences could also explain viola- option sets. The unfolding of these processes tions of regularity. The addition of new depends on the circumstances of the choice options to a choice set (i.e., adding a less situation (i.e., the time pressure, the stakes, attractive option) systematically influences and the way in which preferences are elicit- preferences for options in the original set, ed) and on the options under consideration violating the regularity principle. These types (i.e., the context and the frame). The attrib- of contextual effects need not be unreason- utes an individual uses for comparing options able. When evaluating choice sets, such as might depend on the option set. Their rela- those in table 2, participants in an experiment tive "importance" could depend on the simi- might infer that a dominated option, such as larity of attribute values, and thus explain D, was included in the choice set to provide comparability effects. Focusing the individ- useful information (Paul H. Grice 1975, ual's attention on attributes with different val- 1989; see also Denis J. Hilton 1995). ues seems like an efficient and reasonable Participants might infer that laptop comput- form of decision making. ers with large displays, such as A, are relative- Violations of IIA could also be reasonable. ly uncommon and perhaps less desirable than For instance, Greene (2000, p. 864) in his laptop computers with smaller displays, such standard textbook in writes as B and D. If the market for larger displays is that the principle "is not a particularly small, B could be the best choice. appealing restriction to place on consumer Birger Wernerfelt (1995) made a similar behavior." David Brownstone and Train argument by suggesting that people use the (1999, p. 110) argue that the substitutions available options to draw inferences about effects that are implied by the IIA principle their own tastes and utilities. Decisionmakers "can be unrealistic in many settings" who and know their relative but not their McFadden (2001, p. 358) argues that absolute they utilities could make assumptions "may not be behaviorally plausible." about the correct choice from the choice set Likewise, Hensher and Greene (2003, if they p. believed the options reflected popula- 135) regard the principle as "questionable." tion utilities. They would use the market The famous "red-bus blue-bus" example offerings to assign utilities and infer their described above shows that in many ownsitua- preferences. Drazen Prelec, Wernerfelt, tions it appears unreasonable to follow and the Florian Zettelmeyer (1997) developed a IIA principle. (1993) presented context model that reflects simple infer- another illustrative example of a decision ences as to about one's own preferences (ideal whether to take an apple out of a fruit basketpoints) given what is available (product at a dinner party. A person who behaves "addresses"). They generate predictions that decently might not take the apple if it iscan the simulate compromise and attraction last. But, if more than one apple is left effects.in the Though promising, their model basket, the same person might take an apple. requires additional information (category Of course, the additional apples are not reallyratings) that reflect a decisionmaker's ideal Rieskamp, Busemeyer, and Mellers: Extending the Bounds of Rationality 653 point and his or her perceptions of product violates the axiom may be "taken" by a addresses. pump (see Davidson, John Charles In sum, the different violations of consis- C. McKinsey, and Suppes 1955; Robin P. tency principles can be justifiable within a Cubitt and Sugden 2001). Being "taken" "bounded rationality" framework. Bounded means that a decisionmaker who starts with rationality stresses how choices depend on an option and cycles through a series of the characteristics of the environment. It also choices, paying each time for a new option, stresses how the selection of simple strate- will eventually end up with less money and gies can be quite adaptive. For example, the original option. It is unquestionable that Gigerenzer and Daniel G. Goldstein (1996) such intransitive choices are costly and demonstrated that a simple lexicographic unreasonable. However, we do not provided accurate predictions in a encounter money pumps very often and, if paired-comparison inference task. If a we did, we would probably learn to avoid heuristic is fairly good, it might be beneficial, them. As experimental results show (Yun- even if it occasionally produces intransitive Peng Chu and Ruey-Ling Chu 1990), build- choices. Consistent, Rieskamp and Otto ing a money pump is not easy because (2006) could show that simple can people typically recognize their intransitive predict individuals' choices quite well. choices and quickly change their behavior. Warren Thorngate (1980) and Payne, Thus, money pump "arguments are logical Bettman, and Johnson (1988) demonstrated bogeymen" (Lopes 1996, p. 187) that that simple heuristics can also be efficient demonstrate how irrational behavior in when choosing among options with a flat principle could occur, but that do not show maximum. Another heuristic that can lead to that irrational behavior in fact occurs. intransitivities is the majority rule (e.g., The regular, systematic, and robust viola- Kenneth O. May 1954; Maya Bar-Hillel and tions of consistency set the requirements for Avishai Margalit 1988). With this rule, the descriptive theories of choice. The violations decisionmaker simply counts the number of we have discussed are interrelated in subtle dimensions where an option has an advan- and interesting ways; they illustrate system- tage (disregarding the magnitude of the atic behavioral principles. Whether they advantages) and selects the option with the represent rational or irrational behavior greatest number of advantages. By focusing might be analogous to asking whether the only on the relative advantage of an option glass is half empty or half full. What really compared to alternative options, the majority matters is (1) understanding how and why rule can violate transitivity. Nonetheless, people make choices and (2) being able to these heuristics, such as the lexicographic or predict those choices. Debates about ration- majority rule, require only a small subset of ality focus attention far too narrowly. A the available information, which is then easy broader conversation-one that considers to process. Heuristics can be regarded as reasonable behavior, adaptive behavior, and "approximation methods" of more complex the environment in which choices occur-is strategies, and by "using such methods.., we long overdue. We look forward to this shift implicitly assume that the world is not in focus and the related evidence and theo- designed to take advantage of our approxi- ries that will unfold in the next several mation methods," whereas experiments are decades. often "designed with exactly that goal in APPENDICES mind" (Tversky 1969, p. 46). The traditional argument that transitivity Appendix A: Proof that the elimination by is a cornerstone of rationality is based on aspect theory cannot explain compromise the observation that a decisionmaker who effects. 654 Journal of Economic Literature, Vol. XLIV (September 2006)

Consider a choice set consisting of three effect (see Tversky 1972b). For binary cameras labeled A, B, and C. A is a high- choices, EBA predicts: quality camera with many high-quality fea- (A2) p(A|{A,B})= tures and is very expensive; B is a low-quality camera with no high-quality features and is quite cheap; C is between the two extremes, u(a) having some, but not all, of the high-quality The compromise effect implies that the features and is moderately priced. The com- choice probabilities for the three binary promise camera shares some features with choices are equal so that: both extreme options. With the compromise (A3) u(a) + u(ac) = u(b) + u(bc), effect, p(AI{A,B}) = p(A|{A,C}) = p(BI{B,C}). But, when the three options are evaluated (A4) u(a) + u(ab) = u(c) + u(bc), and together, p(C|{A,B,C}) > p(A {A,B,C}) and (A5) u(b) + u(ab) = u(c) + u(ac). p(CI{A,B,C}) > p(B|{A,B,C}). The compo- nential context theory (Tversky and From (Al) and (A5) it is known that Simonson 1993) predicts the compromise u(b) > u(c) and from (Al) and (A4) it is effect. In contrast, the elimination by aspects known that u(a) > u(c). For triadic choices, theory (Tversky 1972b) cannot predict the EBA predicts effect, as shown below. In the elimination by aspects theory, alternatives are composed of p(AI{A,B,C}) = [u(a) + u(ab) . p(A|{A,B}) + u(ac).p(AI{A,C})]/u(K), common and unique positive aspects: a = unique aspects of A, where u(K) = [u(a) + u(b) + u(c) + u(ab) + b = unique aspects of B, u(ac) + u(bc)]. The compromise effect c = unique aspects of C, implies ab= common aspects between A and B, p(CI{A,B,C}) > p(A {A,B,C}) or but not with C, [u(c) + .5(u(ac) + u(bc))]/u(K) > ac= common aspects between A and C, [u(a) + .5(u(ab) + u(ac))]/u(K), but not with B, and bc= common aspects between B and C, p(CI{A,B,C}) > p(BJ{A,B,C}) or but not with A. [u(c) + .5(u(ac) + u(bc))]/u(K) > Let u be a scale that assigns a positive [u(b) + .5(u(ab) + u(bc)}]/u(K). value to these aspects so that, for instance, If both sides of the inequalities are multi- u(a) represents the positive value of aspects plied by 2 and u(K) is cancelled out, the a to the decisionmaker. Cameras A and C result is: share some high-quality features, whereas cameras A and B have little in common so (A6) [u(c) + u(bc) + u(c) + u(ac)] > that u(ac) > u(ab). Cameras B and C share [u(a) + u(ac) + u(a) + u(ab)], the positive aspect of not being very expen- (A7) [u(c) + u(bc) + u(c) + u(ac)] > sive, which cameras A and C do not share so that u(bc) > u(ab). Thus, on the basis of the [u(b) + u(bc) + u(b) + u(ab)]. similarity relations among the choice With the substitutions from (A3) and (A4), options, it can be assumed that: inequality (A7) can be rewritten as (Al) u(ac) > u(ab) and u(bc) > u(ab). [u(a) +u(ab) +u(c) +u(ac)] > [u(a) + u(ac) + u(b) + u(ab)]. These assumptions, regarding the mapping from options to similarity relations, are essen- After canceling on both sides, it is found tial for EBA to account for the similarity that u(c) > u(b), contradicting the earlier Rieskamp, Busemeyer, and Mellers: Extending the Bounds of Rationality 655

conclusion that u(b) > u(c) from (Al) and (B2) V(AI{A,B,D}) = (A5). Similarly, with the substitutions from kPkVk(A) + O[R(A,B) + R(A,D)] (A4), inequality (A6) can be rewritten as

[u(a) + u(ab) + u(c) + u(ac)] > where Y'kfkvk(A) = v(A) represents the value [u(a) + u(ac) + u(a) + u(ab)]. v of option A neglecting advantages of other options and: After canceling on both sides, it is found that u(c) > u(a), contradicting the earlier conclusion that u(a) > u(c) from (Al) and (B3) (A4). Therefore, EBA cannot predict the compromise effect. ikadvk(A,B) Appendix B: Proof that the componential context theory cannot explain similarity where advk(A,B) = Vk(A) - Vk(B) if A has a effects. positive advantage over B, and zero other- wise, and disk(A,B) = 6[vk(B) -Vk(A)] if Consider a choice set consisting of three option A has a disadvantage over B, and zero cameras labeled A, B, and D. A is a high-qual- otherwise, and where 6 is a convex function ity camera with many high-quality features consistent with loss aversion so that the and is expensive; B is a low-quality camera value for the disadvantage of A over B is with no extra features and is very cheap. larger than the corresponding advantage of Camera D is a slightly higher-quality camera B over A. Convexity is required to explain than A; it has more quality features and a the compromise effect (Tversky and slightly higher price than A. D is very similar Simonson 1993). to A, and both A and D are quite different The fact that the binary choices are all from B. The similarity effect implies that equal implies: adv(B,D)-=adv(D,B) and p(AI{A,B}) = p(AI{A,D}) = p(BI{B,D}). But, v(B) = v(D); adv(B,A) = adv(A,B) and v(A) when the three options are evaluated togeth- = v(B); adv(D,A) = adv(A,D) and v(A) er, either p(BI{A,B,D}) > p(AI{A,B,D}) or = v(D). p(BI{A,B,D}) > p(DI{A,B,D}). The proof that For the similarity effect, as described the componential context theory cannot pre- above, the largest advantage on price occurs dict the similarity effect can be demonstrated between B and D, the next largest advantage with the first of the two triadic choices. occurs between B and A, and the smallest The componential context theory is based advantage occurs between A and D. That is: on the idea that people consider the advan- (B4) adv(B,D) > adv(B,A) > adv(A,D). tages of one option over those of another. Camera A has a higher-quality advantage To explain the similarity effect we must over B and camera B has a higher-price obtain advantage over A. The quality advantage of (B5) V(BI{A,B,D}) > V(A|{A,B,D1) A over B is denoted adv(A,B) and the price and because advantage of B over A is denoted adv(B, A). For triadic choice sets, the model assumes:

(BI) p(AI{A,B,D}) = F[V(AI{A,B,D}), adv(B,A) V(BI{A,B,D}), V(D|{A,B,D})] where F is an increasing function of the first adv(B,A) argument and a decreasing function of the second two. Furthermore: (B5) is true when 656 Journal of Economic Literature, Vol. XLIV (September 2006)

Assume that Y is a set of options adv(B,D) Y = {A1,...,A,,}, for instance a number of college applicants following Tversky's (1969) example. Each candidate i is adv(B,D) described by three attribute values Xik, intel- lectual stability (k = 1), emotional stability (k = 2), and social facility (k = 3). Suppose, adv(A,D) the first candidate has the attribute values But if adv(B,D) > adv(A,D), as in (B4), x1l this= 66, x12= 90, and x13= 85, the second equation must be false for convex 3 candidatefunc- has the values x21= 78, x22= 70, tions. Thus, the componential context and model x,2 = 55, and the third candidate has the cannot predict the similarity effect. values xa31= 90, x32= 32, and xa33= 4. We can apply the mixed logit model as defined by equation 7, such that the choice probability Appendix C: Proof that mixed logit models when choosing between the first and second satisfy regularity. candidate is Assume that X is a complete set of options and Y is a subset of X(Y C X), that is Y= {A1,...,Am} and X= {A1,...,An} withp(A1I n m. Define L(Ai Y; /) as the probability that option Ai is chosen from a set Y given For a convenience, define vq = exp(kfkq" 'Xik). fixed set of coefficients P3. For example, Following L the logic of the mixed logit model may be defined by a mixed logit model assume with the vector of weights, is selected random coefficients P (see for example, with equal probability of 1/3 from the set McFadden and Train, 2000). Assume Q= that :{(11-= -1, 321= 3.2, P,31= -2.6), (312= 1, L(Ai Y;1) satisfies regularity. In other words, P22=-3.025, 1P32=2.45), ( A -3=0.2, 123= given Y C X, it is assumed that L(Ai -0.595,Y;13) A13=0.51)}. - Then, the probability of L(AiJX; ) > 0 for every 13. Then it follows choosing the first candidate when comparing the first and second candidate is: that p(AiIY) is given by a probability mixture of L(AilY;fl) as defined by the integral over p(A1{A1,A2}) = [v11/(v11 +V21) + the densityf. v12A/(v12+v22) +v13 /(V13 +V23)]/3 = [el/(el+ e3) + e2/(e2 + e') + p(AjIY) = JL(Ai Y;P) . f(P)dP. e3/(e3 + e2)]/3 = 0.53.

Then the difference Correspondingly, the probability of choosing the second candidate when comparing the p(AjiY) - p(AIX) = second and third candidate can be deter- JL(AiIY;1) -f(/)d - jL(A IX; P) -f(P)dP= mined, so that p(A21 {A2,A3}) = 0.53 results. Finally, when comparing the first and the J[L(Ai IY;1P) - L(AijX;fP)] . f(f)dp > 0 third candidate the probability p(A11 {A1,A3}) = .47 results, which violates is always positive, which implies that weak stochastic transitivity. p(AI|Y) > p(AijX). Therefore the mixed logit model satisfies regularity. Appendix E: Proof that the generalized extreme value model satisfies weak stochastic transitivity. Appendix D: Example of how the mixed logit model can explain violations of weak Assume that Y is a set of options stochastic transitivity. Y = {A1,...,Am}. The generalized extreme Rieskamp, Busemeyer, and Mellers: Extending the Bounds of Rationality 657 value model defines the binary choice Birnbaum, Michael H., Jamie N. Patton, and Melissa K. Lott. 1999. "Evidence Against Rank-Dependent probability as follows: Utility Theories: Tests of Cumulative Independence, Interval Independence, Stochastic Dominance, and p(AiJ{Ai,Aj}) Transitivity." Organizational Behavior and Human Decision Processes, 77(1): 44-83. Bock, Richard D., and Lyle V. Jones. 1968. The exp Measurement and Prediction of Judgment and Choice. San Francisco: Holden-Day. where X is a vector of Brownstone, attributes, David and Kenneth P3 E. Train.is a1999. vec- tor of weights assigned "Forecasting to the New Productattributes, Penetration with Flexibleand 0 is as free similarity parameter Substitution Patterns." thatJournal of depends Econometrics, 89(1-2): 109-29. on the similarity between the pair of options Budescu, David V., and Wendy Weiss. 1987. "Reflection for each choice set and that can account for of Transitive and Intransitive Preferences: A Test of violations of IIA. Note that ." Organizational Behavior and Human Decision Processes, 39(2): 184-202. p(Ai {Ai,Aj1}) > .50 -- p(Aij {Ai,Aj)/p(AjI {Ai,Aj}) Busemeyer, Jerome R. 1985. "Decision Making Under Uncertainty: A Comparison of Simple Scalability, Fixed-Sample, and Sequential-Sampling Models." Journal of : Learning, =Sexp(f3Xi/O) Memory, and Cognition, 11(3): 538-64. Therefore if Busemeyer, Jerome R., and Adele Diederich. 2002. "Survey of Decision Field Theory." Mathematical p(AiI{A,,Ai1) > .50 and p(AjI{Aj,Ak }) > .50, Social Sciences, 43(3): 345-70. then Busemeyer, Jerome R., and James T. Townsend. 1993. "Decision Field Theory: A Dynamic-Cognitive OXi> OX, and fPXj, > -> fPXk PX, > fPXk > -> Approach to Decision Making in an Uncertain Environment." , 100(3): 432-59. p(AiI{A,,Ak}) > .50. Busemeyer, Jerome R., Ethan Weg, Rachel Barkan, Xuyang Li, and Zhengping Ma. 2000. "Dynamic and REFERENCES Consequential Consistency of Choices Between Paths of Decision Trees." Journal of Experimental Ariely, Dan, and Thomas S. Wallsten. 1995. "Seeking Psychology: General, 129(4): 530-45. Subjective Dominance in Multidimensional Camerer,Space: Colin F. 1989. "An Experimental Test of An Exploration of the Asymmetric Dominance Several Generalized Utility Theories." Journal of Effect." Organizational Behavior and Human Risk and Uncertainty, 2(1): 61-104. Decision Processes, 63(3): 223-32. Chu, Yun-Peng, and Ruey-Ling Chu. 1990. "The Aumann, Robert J. 1962. "Utility Theory Without Subsidence the of Preference Reversals in Simplified Completeness Axiom." , 30(3): 445-62. and Marketlike Experimental Settings: A Note." Ballinger, T. Parker, and Nathaniel T. Wilcox. 1997.American Economic Review, 80(4): 902-11. "Decisions, Error and Heterogeneity." Economic Cohen, Michele, and Jean-Yves Jaffray. 1988. "Is Journal, 107(443): 1090-1105. Savage's Independence Axiom a Universal Rationality Bar-Hillel, Maya, and Avishai Margalit. 1988. "How Principle?" Behavioral Science, 33(1): 38-47. Vicious are Cycles of Intransitive Choice?" Theory Coombs, Clyde H. 1958. "On the Use of Inconsistency and Decision, 24(2): 119-45. of Preferences in Psychological Measurement." Becker, Gordon, Morris H. Degroot, and Jacob Journal of Experimental Psychology, 55(1): 1-7. Marschak. 1963. "Probabilities of Choices Among Coombs, Clyde H., and Dean G. Pruitt. 1961. "Some Very Similar Objects: An Experiment to Decide Characteristics of Choice Behavior in Risky Between Two Models." Behavioral Science, 8(4):Situations." Annals of the New York Academy of 306-11. Sciences, 89(5): 784-94. Bell, David E. 1982. "Regret in Decision Making Cubitt, Robin P., and Robert Sugden. 2001. "On Under Uncertainty." Operations Research, 30(5): Money Pumps." Games and Economic Behavior, 961-81. 37(1): 121-60. Berry, Steven T. 1994. "Estimating Discrete-Choice Davidson, Donald, and Jacob Marschak. 1959. Models of Product Differentiation." RAND Journal "Experimental Tests of a Stochastic Decision of Economics, 25(2): 242-62. Theory." In Measurement: Definitions and Theories, Berry, Steven T., James Levinsohn, and Ariel Pakes. ed. West C. Churchman and Philburn Ratoosh. New 1995. "Automobile Prices in Market Equilibrium." York: Wiley, 233-69. Econometrica, 63(4): 841-90. Davidson, Donald, John Charles C. McKinsey, and Berry, Steven, James Levinsohn, and Ariel Pakes. 1999. Patrick Suppes. 1955. "Outlines of a Formal Theory "Voluntary Export Restraints on Automobiles: of Value, I." Philosophy of Science, 22(2): 140-60. Evaluating a Trade Policy." American Economic Davis, John M. 1958. "The Transitivity of Preferences." Review, 89(3): 400--430. Behavioral Science, 3(1): 26-33. 658 Journal of Economic Literature, Vol. XLIV (September 2006)

Debreu, Gerard. 1958. "Stochastic Choice and American Journal of Psychology, 75(1): 35-44. Cardinal Utility." Econometrica, 26(3): 440-44. Hammond, Kenneth R. 1996. Human Judgment and Debreu, Gerard. 1960. "Review of R. D. Luce, Social Policy. New York: Oxford University Press. Individual Choice Behavior: A Theoretical Analysis." Harbaugh, William T., Kate Krause, and Timothy R. American Economic Review, 50(1): 186-88. Berry. 2001. "GARP for Kids: On the Development Dhar, Ravi, Stephen M. Nowlis, and Steven J. of Rational Choice Behavior." American Economic Sherman. 2000. "Trying Hard or Hardly Trying: An Review, 91(5): 1539-45. Analysis of Context Effects in Choice." Journal of Harless, David W, and Colin F. Camerer. 1994. "The Consumer Psychology, 9(4): 189-200. Predictive Utility of Generalized Expected Utility Domencich, Tom, and Daniel L. McFadden. 1975. Theories." Econometrica, 62(6): 1251-89. Urban Travel Demand: A Behavioral Analysis. Hausman, Jerry A., and David A. Wise. 1978. "A Amsterdam: North-Holland. Conditional Probit Model for Qualitative Choice: Edwards, Ward. 1954. "The Theory of Decision Discrete Decisions Recognizing Interdependence Making." Psychological Bulletin, 51(4): 380-417. and Heterogeneous Preferences." Econometrica, Edwards, Ward. 1961. "Behavioral ." 46(2): 403-26. Annual Review of Psychology, 12: 473-98. Heath, Timothy B., and Subimal Chatterjee. 1995. Falmagne, Jean Claude, and Geoffrey Iverson. 1979. "Asymmetric Decoy Effects on Lower-Quality versus "Conjoint Weber Laws and Additivity." Journal of Higher-Quality Brands: Meta-Analytic and , 20(2): 164-83. Experimental Evidence." Journal of Consumer Fishburn, Peter C. 1982. "Nontransitive Measurable Research, 22(3): 268-84. Utility." Journal of Mathematical Psychology, 26(1): Hensher, David A. 1986. "Dimensions of Automobile 31-67. Demand: An Overview of an Australian Research Gigerenzer, Gerd. 1991. "How to Make Cognitive Project." Environment and Planning A, 18(10): Illusions Disappear: Beyond 'Heuristics and Biases'." 1339-74. In European Review of , Vol. 2, ed. Hensher, David A., and William H. Greene. 2003. "The Wolfgang Stroebe and Miles Hewstone. New York: Mixed Logit Model: The State of Practice." Wiley, 83-115. Transportation, 30(2): 133-76. Gigerenzer, Gerd. 1996a. "Rationality: Why Social Hey, John D. 2001. "Does Repetition Improve Context Matters." In Interactive Minds: Life-Span Consistency?" , 4(1): 5-54. Perspectives on the Social Foundation of Cognition, Hey, John D., and Chris Orme. 1994. "Investigating ed. Paul B. Baltes and Ursula M. Staudinger. Generalizations of Expected Utility Theory Using Cambridge: Cambridge University Press, 319-46. Experimental Data." Econometrica, 62(6): 1291-1326. Gigerenzer, Gerd. 1996b. "On Narrow Norms and Highhouse, Scott. 1996. "Context-Dependent Vague Heuristics: A Reply to Kahneman and Selection: The Effects of Decoy and Phantom Job Tversky." Psychological Review, 103(3): 592-96. Candidates." Organizational Behavior and Human Gigerenzer, Gerd, and Daniel G. Goldstein. 1996. Decision Processes, 65(1): 68-76. "Reasoning the Fast and Frugal Way: Models of Hilton, Denis J. 1995. "The Social Context of Bounded Rationality." Psychological Review, 103(4): Reasoning: Conversational Inference and Rational 650-69. Judgment." Psychological Bulletin, 118(2): 248-71. Gigerenzer, Gerd, and Reinhard Selten, eds. 2001. Hong, Chew Soo, Edi Karni, and Zvi Safra. 1987. "Risk Bounded Rationality: The Adaptive Toolbox. Aversion in the Theory of Expected Utility with Rank Cambridge: MIT Press. Dependent Probabilities." Journal of Economic Gonzalez-Vallejo, Claudia. 2002. "Making Trade-Offs: A Theory, 42(2): 370-81. Probabilistic and Context-Sensitive Model of Choice Huber, Joel, John W. Payne, and Christopher Puto. Behavior." Psychological Review, 109(1): 137-55. 1982. "Adding Asymmetrically Dominated Alterna- Green, Jerry R., and Bruno Jullien. 1989. "Ordinal tives: Violations of Regularity and the Similarity Independence in Nonlinear Utility Theory: Hypothesis." Journal of Consumer Research, 9(1): Erratum." Journal of Risk and Uncertainty, 2(1): 90-98. 119. Huber, Joel, and Christopher Puto. 1983. "Market Greene, William H. 2000. Econometric Analysis. Boundaries and Product Choice: Illustrating Attrac- Upper Saddle River, N.J.: Prentice-Hall. tion and Substitution Effects." Journal of Consumer Grether, David M., and Charles R. Plott. 1979. Research, 10(1): 31-44. "Economic Theory of Choice and the Preference Keeney, Ralph L., and Howard Raiffa. 1976. Decisions Reversal Phenomenon." American Economic with Multiple Objectives: Preferences and Value Review, 69(4): 623-38. Tradeoffs. New York: Cambridge University Press. Grice, Paul H. 1975. "Logic and Conversation." Krantz, In David H. 1967. "Rational Distance Functions Syntax and Semantics 3: Speech Acts, ed. Peter forCole Multidimensional Scaling." Journal of and Jerry L. Morgan. New York: Academic Press, Mathematical Psychology, 4(2): 226-45. 43-58. Leland, Jonathan W. 1994. "Generalized Similarity Grice, Paul H. 1989. Studies in the Way of Words. Judgments: An Alternative Explanation for Choice Cambridge: Press. Anomalies." Journal of Risk and Uncertainty, 9(2): Griswold, Betty J., and R. Duncan Luce. 1962. 151-72. "Choices among Uncertain Outcomes: A Test of a Lindman, Harold R., and James Lyons. 1978. Decomposition and Two Assumptions of Transitivity." "Stimulus Complexity and Choice Inconsistency Rieskamp, Busemeyer, and Mellers: Extending the Bounds of Rationality 659

Among Gambles." Organizational Behavior and Techniques." Mathematical Social Sciences, 23(1): Human Decision Processes, 21(2): 146-59. 5-29. Linhart, Heinz, and Walter Zucchini. 1986. Model Marley, Anthony Alfred J., and Hans Colonius. 1992. Selection. New York: Wiley. "The 'Horse Race' Random Utility Model for Choice Loomes, Graham, and Robert Sugden. 1982. "Regret Probabilities and Reaction Times, and its Competing Theory: An Alternative Theory of Rational Choice Interpretation." Journal of Mathematical under Uncertainty." Economic Journal, 92(368): Psychology, 36(1): 1-20. 805-24. May, Kenneth O. 1954. "Transitivity, Utility, and Loomes, Graham, and Robert Sugden. 1995. Aggregation in Preferences Patterns." Econometrica, "Incorporating a Stochastic Element into Decision 22(1): 1-13. Theories." European Economic Review, 39(3-4): McFadden, Daniel. 1974. "Conditional Logit Analysis 641-48. of Qualitative Choice Behavior." In Frontiers in Lopes, Lola L. 1996. "When Time Is of the Essence: Econometrics, ed. Paul Zarembka. New York: Averaging, Aspiration, and the Short Run." Academic Press, 105-42. Organizational Behavior and Human Decision McFadden, Daniel. 1978. "Modeling the Choice of Processes, 65(3): 179-89. Residential Location." In Spatial Interaction Theory Lopes, Lola L., and Gregg C. Oden. 1991. "The and Planning Models, ed. Anders Karlqvist, Lars Rationality of ." In Probability and Lundqvist, Folke Snickars, and Jorgen W Weibull. Rationality: Studies on L. Jonathan Cohen's New York: North-Holland, 75-96. Philosophy of Science, ed. Ellery Eels and Tomasz McFadden, Daniel. 1981. "Econometric Models of Maruszewski. Amsterdam: Editions Rodopi, Probabilistic Choice." In Structural Analysis of 199-223. Discrete Data with Economic Applications, ed. Luce, R. Duncan. 1956. "Semiorders and a Theory of Charles F. Manski and Daniel McFadden. Utility Discrimination." Econometrica, 24(2): 178-91. Cambridge: MIT Press, 198-272. Luce, R. Duncan. 1958. "A Probabilistic Theory McFadden,of Daniel. 1999. "Rationality for Economists?" Choice." Econometrica, 26(2): 193-224. Journal of Risk and Uncertainty, 19(1-3): 73-105. Luce, R. Duncan. 1959. Individual Choice Behavior McFadden, Daniel. 2001. "Economic Choices. New York: Wiley. American Economic Review, 91(3): 351-78. Luce, R. Duncan. 1990. "Rational Versus Plausible McFadden, Daniel, and Kenneth E. Train. 2000. Accounting Equivalences in Preference Judgments." "Mixed MNL Models for Discrete Response." Psychological Science, 1(4): 225-34. Journal of Applied Econometrics, 15(5): 447-70. Luce, R. Duncan. 2000. Utility of Gains and Losses: McLean, lain. 1995. "Independence of Irrelevant Measurement-Theoretical and Experimental Alternatives before Arrow." Mathematical Social Approaches. Mahwah, N.J.: Lawrence Erlbaum Sciences, 30(2): 107-26. Associates. Mellers, Barbara A., and Karen Biagini. 1994. Luce, R. Duncan, Barbara A. Mellers, and Shi-jie "Similarity and Choice." Psychological Review, Chang. 1993. "Is Choice the Correct Primitive? On 101(3): 505-18. Using Certainty Equivalents and Reference Levels Mellers, Barbara A., Shi-Jie Chang, Michael H. to Predict Choices among Gambles." Journal of Risk Birnbaum, and Lisa D. Ordonez. 1992. and Uncertainty, 6(2): 115-43. "Preferences, Prices, and Ratings in Risky Decision Luce, R. Duncan, and Patrick Suppes. 1965. Making." Journal of Experimental Psychology: "Preference, Utility, and Subjective Probability." In Human Perception and Performance, 18(2): 347-61. Handbook of Mathematical Psychology, Vol. 3, ed. R.Meyer, Robert J., and Barbara E. Kahn. 1991. Duncan Luce, Robert B. Bush, and Eugene "Probabilistic Models of Consumer Choice Galanter. New York: Wiley, 249-410. Behavior." In Handbook of Consumer Behavior, ed. Luce, R. Duncan, and Detlof von Winterfeldt. 1994. Thomas S. Robertson and Harold R. Kassarjian. "What Common Ground Exists for Descriptive, Upper Saddle River, N.J.: Prentice Hall, 85-123. Prescriptive, and Normative Utility Theories?" Mishra, Sanjay, U.N. Umesh, and Donald E. Stem. Management Science, 40(2): 263-79. 1993. "Antecedents of the Attraction Effect: An MacCrimmon, Kenneth R. 1968. "Descriptive and Information-Processing Approach." Journal of Normative Implications of the Decision-Theory Marketing Research, 30(3): 331-49. Postulates." In Risk and Uncertainty, ed. Karl Borch Mongin, Philippe. 2000. "Does Optimization Imply and Jan Mossin. New York: St. Martin's Press, 3-23. Rationality?" Synthese, 124(1): 73-111. Machina, Mark J. 1982. "'Expected Utility' Analysis Montgomery, Henry. 1977. "A Study of Intransitive without the Independence Axiom." Econometrica, Preferences Using a Think-Aloud Procedure." In 50(2): 277-323. Decision-Making and Change in Human Affairs, ed. Machina, Mark J. 1985. "Stochastic Choice Functions Helmut Jungermann and Gerard de Zeeuw. Generated from Deterministic Preferences over Dordrecht: Reidel, 347-64. Lotteries." Economic Journal, 95(379): 575-94. Morrison, William H. 1963. "Testable Conditions for Manski, Charles F. 1977. "Structure of Random Utility Triads of Paired Comparison Choices." Models." Theory and Decision, 8(3): 229-54. Psychometrika, 28(4): 369-90. Marley, Anthony Alfred J. 1992. "A Selective Review of Mosteller, Frederick, and Philip Nogee. 1951. "An Recent Characterizations of Stochastic Choice Experimental Measurement of Utility." Journal of Models Using Distribution and Functional Equation Political Economy, 59(5): 371-404. 660 Journal of Economic Literature, Vol. XLIV (September 2006) von Neumann, John, and Oskar Morgenstern. 1947. of the Luce and Restle Choice Models." Journal of Theory of Games and Economic Behavior Princeton, Mathematical Psychology, 8(3): 370-81. N.J.: Press. Samuelson, Paul. 1953. Foundations of Economic Pan, Yigang, and Donald R. Lehmann. 1993. "The Analysis. Cambridge: Harvard University Press. Influence of New Brand Entry on Subjective Brand Savage, Leonard J. 1954. The Foundations of Statistics. Judgments." Journal of Consumer Research, 20(1): New York: Wiley. 76-86. Schmidt, Peter, and Robert P. Strauss. 1975. Pan, Yigang, Sue O'Curry, and Robert Pitts. 1995. "The "Estimation of Models with Jointly Dependent Attraction Effect and Political Choice in Two Qualitative Variables: A Simultaneous Logit Elections." Journal of Consumer Psychology, 4(1): Approach." Econometrica, 43(4): 745-55. 85-101. Schoemaker, Paul J. H. 1982. "The Expected Utility Payne, John W., James R. Bettman, and Eric J. Johnson. Model: Its Variants, Purposes, Evidence and 1988. "Adaptive Strategy Selection in Decision Limitations." Journal of Economic Literature, 20(2): Making." Journal of Experimental Psychology: 529-63. Learning, Memory, and Cognition, 14(3): 534-52. Selten, Reinhard. 1991. "Properties of a Measure of Payne, John W., James R. Bettman, and Eric J. Predictive Success." Mathematical Social Sciences, Johnson. 1993. The Adaptive Decisionmaker. 21(2): 153-67. Cambridge: Cambridge University Press. Selten, Reinhard. 2001. "What Is Bounded Pitt, Mark A., In Jae Myung, and Shaobo Zhang. 2002. Rationality?" In Bounded Rationality: The Adaptive "Toward a Method of Selecting Among Toolbox, ed. and Reinhard Selten. Computational Models of Cognition." Psychological Cambridge: MIT Press, 13-36. Review, 109(3): 472-91. Sen, Amartya. 1993. "Internal Consistency of Choice." Prelec, Drazen, Birger Wernerfelt, and Florian Econometrica, 61(3): 495-521. Zettelmeyer. 1997. "The Role of Inference in Shafir, Eldar B., and Robyn A. Leboeuf. 2002. Context Effects: Inferring What You Want from "Rationality." Annual Review of Psychology, 53(1): What Is Available." Journal of Consumer Research, 491-517. 24(1): 118-25. Shafir, Eldar B., Daniel N. Osherson, and Edward E. Quiggin, John. 1982. "A Theory of Anticipated Utility." Smith. 1993. "The Advantage Model: A Comparative Journal of Economic Behavior and Organization, Theory of Evaluation and Choice Under Risk." 3(4): 323-43. Organizational Behavior and Human Decision Radner, Roy, and Jacob Marschak. 1954. "Note on Processes, 55(3): 325-78. Some Proposed Decision Criteria." In Decision Simon, Herbert A. 1956. "Rational Choice and the Processes, ed. Robert M. Thrall, Clyde H. Coombs, Structure of the Environment." Psychological and Robert L. Davis. New York: Wiley, 61-68. Review, 63(2): 129-38. Ranyard, Robert H. 1977. "Risky Decisions Which Simon, Herbert A. 1983. "Alternative Visions of Violate Transitivity and Double Cancellation." Acta Rationality." In Reason in Human Affairs, ed. Herbert Psychologica, 41(6): 449-59. A. Simon. Stanford: Press, 7-35. Revelt, David, and Kenneth E. Train. 1998. "Mixed Simonson, Itamar. 1989. "Choice Based on Reasons: Logit with Repeated Choices: ' Choices The Case of Attraction and Compromise Effects." of Appliance Efficiency Level." Review of Economics Journal of Consumer Research, 16(2): 158-74. and Statistics, 80(4): 647-57. Sjoberg, Lennart. 1975. "Uncertainty of Comparative Rieskamp, J6rg, and Ulrich Hoffrage. 1999. "When Do Judgments and Multidimensional Structure." People Use Simple Heuristics, and How Can We Multivariate Behavioral Research, 10(2): 207-18. Tell?" In Simple Heuristics That Make Us Smart, ed. Sjoberg, Lennart. 1977. "Choice Frequency and Gerd Gigerenzer, Peter M. Todd, and the ABC Similarity." Scandinavian Journal of Psychology, Research Group. New York: Oxford University 18(2): 103-15. Press, 141-67. Sjoberg, Lennart, and Dora Capozza. 1975. Rieskamp, J6rg, and Philipp E. Otto. 2006. "SSL: A "Preference and Cognitive Structures of Italian Theory of How People Learn to Select Strategies." Political Parties." Giornale Italiano di Psicologia, Journal of Experimental Psychology: General, 2(3): 391-402. 135(2): 207-36. Slovic, Paul, and Sarah Lichtenstein. 1983. "Preference Roe, Robert M., Jerome R. Busemeyer, and James T. Reversals: A Broader Perspective." American Townsend. 2001. "Multialternative Decision Field Economic Review, 73(4): 596-605. Theory: A Dynamic Connectionist Model of Decision Starmer, Chris. 2000. "Developments in Non-Expected Making." Psychological Review, 108(2): 370-92. Utility Theory: The Hunt for a Descriptive Theory of Roelofsma, Peter, and Daniel Read. 2000. "Intransitive Choice under Risk." Journal of Economic Literature, Intertemporal Choice." Journal of Behavioral 38(2): 332-82. Decision Making, 13(2): 161-77. Stern, Eliahu, and Harry W. Richardson. 2005. Rubinstein, Ariel. 1988. "Similarity and Decision- "Behavioural Modelling of Road Users: Current Making under Risk (Is There a Utility Theory Research and Future Needs." Transport Reviews, Resolution to the ?)" Journal of 25(2): 159-80. Economic Theory, 46(1): 145-53. Thorngate, Warren. 1980. "Efficent Decision Rumelhart, Donald L., and James G. Greeno. 1971. Heuristics." Behavioral Science, 25(3): 219-25. "Similarity Between Stimuli: An Experimental Test Thurstone, Louis L. 1927. "A Law of Comparative Rieskamp, Busemeyer, and Mellers: Extending the Bounds of Rationality 661

Judgment." Psychological Review, 34(4): 273-86. 111(3): 757-69. Thurstone, Louis L. 1959. The Measurement of Values. Wakker, Peter, Ido Erev, and Elke U. Weber. 1994. Chicago: Press. "Comonotonic Independence: The Critical Test Train, Kenneth E. 2003. Discrete Choice Models with between Classical and Rank-Dependent Utility Simulation. Cambridge: Cambridge University Press. Theories." Journal of Risk and Uncertainty, 9(3): Tversky, Amos. 1969. "Intransitivity of Preferences." 195-230. Psychological Review, 76(1): 31-48. Wedell, Douglas H. 1991. "Distinguishing Among Tversky, Amos. 1972a. "Choice by Elimination." Models of Contextually Induced Preference Journal of Mathematical Psychology, 9(4): 341-67. Reversals." Journal of Experimental Psychology: Tversky, Amos. 1972b. "Elimination By Aspects: A Theory Learning, Memory, and Cognition, 17(4): 767-78. of Choice." Psychological Review, 79(4): 281-99. Wedell, Douglas H., and Jonathan C. Pettibone. 1996. Tversky, Amos, and J. Edward Russo. 1969. "Using Judgments to Understand Decoy Effects in "Substitutability and Similarity in Binary Choices." Choice." Organizational Behavior and Human Journal of Mathematical Psychology, 6(1): 1-12. Decision Processes, 67(3): 326-44. Tversky, Amos, and Itamar Simonson. 1993. "Context Wernerfelt, Birger. 1995. "A Rational Reconstruction Dependent Preferences." Management Science, of the Compromise Effect: Using Market Data to 39(10): 1179-89. Infer Utilities." Journal of Consumer Research, Tversky, Amos, Paul Slovic, and Daniel Kahneman. 21(4): 627-33. 1990. "The Causes of Preference Reversal." von Winterfeldt, Detlof, and Ward Edwards. 1986. American Economic Review, 80(1): 204-17. Decision Analysis and Behavioral Research. Usher, Marius, and James L. McClelland. 2004. Cambridge:"Loss Cambridge University Press. Aversion and Inhibition in Dynamical Models Yaari, of Menahem E. 1987. "The Dual Theory of Choice Multialternative Choice." Psychological Review, under Risk." Econometrica, 55(1): 95-115.