The Bounds of Reason: Game Theory for the Behavioral Sciences
Herbert Gintis
Princeton University Press, Princeton, New Jersey
Copyright © 2008 by Princeton University Press
Published by Princeton University Press, 41 William Street, Princeton, New Jersey 08540
In the United Kingdom: Princeton University Press, 3 Market Place, Woodstock, Oxfordshire OX20 1SY
All Rights Reserved
Library of Congress Cataloging-in-Publication Data
Gintis, Herbert
The Bounds of Reason / Herbert Gintis
p. cm.
Includes bibliographical references and index.
ISBN 0-691-00942-2 (cloth: alk. paper)
ISBN 0-691-00943-0 (pbk.: alk. paper)
1. Game theory. 2. Economics, Mathematical. I. Title.
HB144.G56 2008 330'.01'5193–dc21 99-054923
This book has been composed in Times and Mathtime by the author
The paper used in this publication meets the minimum requirements of ANSI/NISO Z39.48-1992 (R1997) (Permanence of Paper)
www.pup.princeton.edu
Printed in the United States of America
10 9 8 7 6 5 4 3 2 1
10 9 8 7 6 5 4 3 2 1 (pbk.)

This book is dedicated to Shirley and Gerson Gintis, and Flora and Melvin Greisler, with all my love.

There is no sorrow so great that it does not find its background in joy.
Niels Bohr
Contents
Preface xii
1 Decision Theory and Human Behavior 2
  1.1 Beliefs, Preferences, and Constraints 6
  1.2 The Meaning of Rational Action 7
  1.3 Why Are Preferences Consistent? 9
  1.4 Time Inconsistency 10
  1.5 The Expected Utility Principle 14
  1.6 The Biological Basis for Expected Utility 18
  1.7 The Allais and Ellsberg Paradoxes 18
  1.8 Risk and the Shape of the Utility Function 21
  1.9 Prospect Theory 24
  1.10 Heuristics and Biases in Decision-making 29
2 Game Theory: Basic Concepts 33
  2.1 Big John and Little John 33
  2.2 The Extensive Form 39
  2.3 The Normal Form 42
  2.4 Mixed Strategies 43
  2.5 Nash Equilibrium 43
  2.6 The Fundamental Theorem of Game Theory 44
  2.7 Solving for Mixed Strategy Nash Equilibria 45
  2.8 Throwing Fingers 46
  2.9 Battle of the Sexes 47
  2.10 The Hawk-Dove Game 49
  2.11 The Prisoner's Dilemma 49
3 Game Theory and Human Behavior 51
  3.1 Methodological Issues in Behavioral Game Theory 55
  3.2 An Anonymous Market Exchange 59
  3.3 The Rationality of Altruistic Giving 61
  3.4 Conditional Altruistic Cooperation 62
  3.5 Altruistic Punishment 64
  3.6 Strong Reciprocity in the Labor Market 65
  3.7 Altruistic Third Party Punishment 68
  3.8 Altruism and Cooperation in Groups 71
  3.9 Inequality Aversion 74
  3.10 The Trust Game 77
  3.11 Character Virtues 79
  3.12 The External Validity of Laboratory Experiments 81
4 Eliminating Dominated Strategies 86
  4.1 Backward Induction and Subgame Perfection 88
  4.2 Experimental Evidence on Backward Induction 90
  4.3 The Repeated Prisoner's Dilemma 92
  4.4 Reciprocators 92
  4.5 Rationalizability and Dominated Strategies 94
  4.6 The Centipede Game 96
  4.7 Rationalizability in the Centipede Game 98
  4.8 The Traveler's Dilemma 98
  4.9 A Weakness of the Rationalizability Criterion 99
5 The Incoherence of Backward Induction 102
  5.1 The Internal Inconsistency of Backward Induction 102
  5.2 How to Play the Centipede Game 104
  5.3 The Surprise Examination 106
  5.4 Dominated Strategies in the Extensive Form 106
  5.5 The Chain Store Paradox Resolved 107
  5.6 Cautious Rationalizability 107
  5.7 Examples of Cautious Rationalizability 109
6 The Mixing Problem 110
  6.1 Harsanyi's Purification Theorem 112
  6.2 Epistemic Games: Mixed Strategies as Conjectures 114
  6.3 Resurrecting the Conjecture Approach to Purification 117
7 Bayesian Rationality and Social Epistemology 119
  7.1 Introduction 119
  7.2 Alice, Bob, and the Choreographer 120
  7.3 An Efficiency-enhancing Choreographer 121
  7.4 Correlated Strategies and Correlated Equilibria 122
  7.5 The Geometric Structure of Correlated Equilibria 124
  7.6 Correlated Equilibrium and Bayesian Rationality 126
  7.7 Why Play a Correlated Equilibrium? 126
  7.8 Social Epistemology 127
  7.9 Social Norms 130
  7.10 Game Theory and the Evolution of Norms 130
  7.11 The Merchants' Wares 131
  7.12 The Character of Social Norms 132
  7.13 Common Priors 133
8 Epistemic Game Theory 136
  8.1 The Modal Logic of Knowledge 137
  8.2 Backward Induction and Rationality 140
  8.3 Conditions for Nash Equilibrium in Two-Player Games 142
  8.4 A Three-player Counterexample 144
  8.5 Counterexample II 145
  8.6 The Modal Logic of Common Knowledge 147
  8.7 The Commonality of Knowledge 150
  8.8 The Tactful Ladies 151
  8.9 The Tactful Ladies and the Commonality of Knowledge 154
  8.10 Agreeing to Disagree 155
  8.11 The Demise of Methodological Individualism 158
9 Reflective Reason and Equilibrium Refinements 160
  9.1 Incredible Threats and Trembling Hands 162
  9.2 The Local Best Response Criterion 163
  9.3 A Forward Induction Equilibrium 164
  9.4 Selten's Horse 166
  9.5 The Spence Signaling Model 167
  9.6 The LBR Criterion Ignores Irrelevant Node Changes 169
  9.7 The LBR Criterion in a Money Burning Game 170
  9.8 Beer and Quiche without the Intuitive Criterion 171
  9.9 The Principle of Insufficient Reason 172
10 The Analytics of Human Sociality 174
  10.1 Explaining Cooperation: An Overview 174
  10.2 Bob and Alice Redux 177
  10.3 The Folk Theorem 179
  10.4 The Folk Theorem with Imperfect Public Information 182
  10.5 Cooperation with Private Signaling 188
  10.6 Purification with Private Information 189
  10.7 A Reputational Model of Honesty and Corruption 190
  10.8 Honesty and Corruption II 192
  10.9 Altruistic Punishing in the Public Goods Game 193
References 199
Preface
The eternal mystery of the world is its comprehensibility. Albert Einstein
Mathematics without natural history is sterile, but natural history without mathematics is muddled. John Maynard Smith
Game theory is central to understanding the dynamics of life forms in general, and humans in particular. Living creatures not only play games, but dynamically transform the games they play, and have thereby evolved their unique identities. For this reason, the material in this book is foundational to all the behavioral sciences, from biology, psychology, and economics to anthropology, sociology, and political science. We humans have a stunning capacity to reason, and to apply the fruits of reason to the transformation of our social existence. Social interactions in a vast array of species can be analyzed with game theory, yet only humans are capable of playing a game after being told its rules. This book is based on the appreciation that evolution and reason interact in constituting the social life and strategic interaction of humans.

Game theory, however, is not everything. This book systematically and single-mindedly denies one of the guiding prejudices of contemporary game theory: the notion that game theory and the implications of Bayesian rationality (developed in chapter 1) are sufficient, at least insofar as human beings are rational, to explain all of human social existence. In fact, game theory is complementary to ideas developed and championed in all the behavioral disciplines. Behavioral scientists who have rejected game theory in reaction to the extravagant claims of some of its adherents may thus want to reconsider their position, recognizing that, just as game theory without broader social theory is merely technical bravado, so social theory without game theory is a fatally handicapped enterprise.

The bounds of reason are not irrationality, but sociality. As we stress throughout this book, the assumption that humans engaged in strategic interaction are rational is an excellent and indispensable first (and perhaps even second and third) approximation. However, humans have what we may call a social epistemology, meaning that we have reasoning processes that afford us forms of knowledge and understanding, especially understanding of the content of other minds, that are unavailable to merely "rational" creatures. It is these epistemological features that characterize our species, and bid us to develop epistemic game theory in the pages that follow.

That game theory does not stand alone entails denying methodological individualism, a philosophical position asserting that all social phenomena can be explained purely in terms of the characteristics of rational agents, the actions available to them, and the constraints that they face. This position is incorrect because, as we shall see, human society is a system with emergent properties, including social norms, that cannot be derived as the product of interacting rational agents, any more than the chemical and biological properties of matter can be derived from the behavior of fundamental particles. Moreover, humans have a social epistemology—evolved cognitive capacities that permit the sharing of beliefs across agents in a manner completely opaque to classical and even standard epistemic game theory.

Evolutionary game theory often succeeds where classical game theory fails (Gintis 2009). The evolutionary approach to strategic interaction helps us understand the emergence, transformation, and stabilization of behaviors. In evolutionary game theory, successful strategies diffuse across populations of players rather than being learned inductively by disembodied rational agents.
Moreover, reasoning is costly, so rational agents will often not even attempt to learn optimal strategies for complicated games, but will rather copy the behavior of successful agents whom they encounter. Evolutionary game theory allows us to investigate the interaction of learning, mutation, and imitation in the spread of strategies when information processing is costly.

But evolutionary game theory cannot deal with unique events, such as strangers interacting in a novel environment, or Middle East peace negotiations. Moreover, by assuming that agents have very low-level cognitive capacities, evolutionary game theory ignores one of the most important of human capacities, that of being able to reason. This book champions a unified approach based on modal logic, epistemic game theory, and social epistemology as an alternative to classical, and a supplement to evolutionary, game theory.
This approach holds that human behavior is most fruitfully modeled as the interaction of rational agents with a social epistemology, in the context of social norms that act as correlating devices that choreograph social interaction. This approach challenges contemporary sociology, which rejects the rational actor model. My response to the sociologists is that this rejection is the reason sociological theory has atrophied since the death of Talcott Parsons. This approach also challenges contemporary social psychology, which not only rejects the rational actor model but generally delights in uncovering human "irrationalities." My response to the social psychologists is that this rejection accounts for the absence of a firm analytical base for the discipline, which must content itself with a host of nano-models that illuminate highly specific aspects of human functioning with no analytical linkages among them.

Game theory is a tool for investigating the world. By allowing us to specify carefully the conditions of social interaction (player characteristics, rules, informational assumptions, payoffs), its predictions can be tested, and the results can be replicated in different laboratory settings. For this reason, behavioral game theory has become increasingly influential in affecting research priorities. This aspect of game theory cannot be overstressed, because the behavioral sciences currently consist of some fields, like economics, where theory has evolved virtually without regard for the facts.

Economic theory has been particularly compromised by its neglect of the facts concerning human behavior. This situation became clear to me in the summer of 2001, when I happened to be reading a popular introductory graduate text in quantum mechanics, as well as a leading graduate text in microeconomics. The physics text began with the anomaly of black body radiation, which was inexplicable using the standard tools of electromagnetic theory. In 1900, Max Planck derived a formula that fit the data perfectly, assuming that radiation was discrete rather than continuous. In 1905, Albert Einstein explained another anomaly of classical electromagnetic theory, the photoelectric effect, using Planck's trick. The text continued, page after page, with new anomalies (Compton scattering, the spectral lines of elements of low atomic number, etc.) and new, partially successful models explaining the anomalies. This culminated in about 1925 with Heisenberg's matrix mechanics and Schrödinger's equation, which fully unified the field.

By contrast, the microeconomics text, despite its beauty, did not contain a single fact in the whole thousand-page volume. Rather, the authors build economic theory in axiomatic fashion, making assumptions on the basis
of their intuitive plausibility, their incorporation of the "stylized facts" of everyday life, or their appeal to the principles of rational thought. A bounty of excellent economic theory was developed in the Twentieth century in this manner. But the well has now run dry. We will see that empirical evidence challenges the very foundations of both classical game theory and neoclassical economics. Future advances in economics will require that model building be brought into dialogue with empirical testing, behavioral data-gathering, and agent-based models.

A simple generalization can be made: decision theory has developed valid algorithms by which people can best attain their objectives. Given these objectives, when people fail to act as predicted by decision theory, the theory is almost always correct and the observed behavior faulty. Indeed, when deviations from theoretical predictions are pointed out to intelligent individuals, they generally agree that they have erred. By contrast, the extension of decision theory to the strategic interaction of Bayesian decision-makers has led to a limited array of useful principles. Surprisingly, most applied game theorists remain unaware of this fact. Rather, the contemporary culture of game theory (as measured by what is accepted without complaint in a journal article) is to act as if epistemic game theory, which has flourished in the past two decades in the economic theory journals, did not exist. Thus, it is virtually universal to assume that agents will play a Nash equilibrium, that it is rational to use backward induction, to play mixed strategies, and to use only subgame perfect strategies. When people do not act that way, their rationality is called into question, whereas in fact none of these assumptions can be successfully defended.

The reason for the inability of decision theory to extend to strategic interaction is quite simple. Decision theory shows that when a few plausible axioms hold, we can model agents as having beliefs (subjective priors) and a utility function over outcomes such that the agent's choices maximize the expected utility of the outcomes. In strategic interaction, there is nothing guaranteeing that all interacting parties have mutually consistent beliefs. Yet, as we shall see, a high degree of intersubjective belief consistency is required to ensure that agents will play Nash equilibrium strategies. Economic theory is comfortable with heterogeneous preferences, but requires homogeneous beliefs to predict coordinated behavior. Yet there is nothing in Bayesian rationality that guarantees homogeneity of beliefs under plausible circumstances.
With the exception of biology, which has strong ties to natural science, the behavioral sciences have yet to adopt a serious commitment to linking fundamental theory and empirical research. This is evident from the fact that the various behavioral disciplines hold distinct and incompatible models of human behavior, yet their leading theoreticians and experimentalists make no attempt to adjudicate these differences (Gintis 2007b). Within economics there have been stunning advances in both theory and data in the past two decades, yet theoreticians and experimentalists retain a hostile attitude to each other's work. Consider a recent interview by Sergiu Hart (2005) with Nobel Prize winner and chief theorist of epistemic game theory, Robert Aumann. Hart asks, "If I may summarize your views on this, ... much of what is done in experimental work is artificial...." Aumann responds, "Yes." The fact that theorists criticize and fear the experimentalists is an embarrassing mark of the infancy of economics as a science.

Conversely, many experimentalists ignore and dislike the mathematical abstractions so dear to the theorists. I doubt that more than a few experimentalists understand, even vaguely, what Robert Aumann's contributions to game theory are, for example. Behavioral game theory has gone far in capturing the low-hanging fruit, by showing that the traditional model of Homo economicus is incorrect. It is now time to move on, learn modern epistemic game theory, and test models that have more of a chance of being correct.

It is often said that the mathematical rigor of contemporary economic theory is due to the economists' "physics-envy." In fact, physicists generally judge models according to their ability to account for the facts, not their mathematical rigor. Physicists generally believe that rigor is the enemy of creative physical insight, and they leave rigorous formulations to the mathematicians. Economic theorists overvalue rigor precisely because they undervalue explanatory power. I follow the physicists in this book. The justification of a model is its capacity to explain how the world works, not the degree of rigor with which it was derived from first principles. Game theory can be used very profitably by researchers who do not know or care about mathematical intricacies, but rather treat mathematics as one of several tools commonly deployed in the search for truth. I assert, then, that my arguments are correct and logically argued. I will leave rigor to the mathematicians.

In Gintis (2009), I stress that understanding game theory requires solving lots of problems. I also stress therein that many of the weaknesses of classical game theory have beautiful remedies in evolutionary game theory. Neither of these considerations is dealt with in The Bounds of Reason, so I invite the reader to treat Game Theory Evolving as a complementary treatise.

The generous support of the John D. and Catherine T. MacArthur Foundation and the intellectual environments of the Santa Fe Institute, Central European University (Budapest), and the University of Siena afforded me the time, resources, and research atmosphere to complete The Bounds of Reason.
I would also like to thank Robert Aumann, Robert Axtell, Pierpaolo Battigalli, Larry Blume, Cristina Bicchieri, Ken Binmore, Samuel Bowles, Robert Boyd, Adam Brandenburger, Songlin Cai, Colin Camerer, Graciela Chichilnisky, Catherine Eckel, Jon Elster, Armin Falk, Ernst Fehr, Alex Field, Urs Fischbacher, Daniel Gintis, Jack Hirshleifer, Sung Ha Hwang, David Laibson, Michael Mandler, Yusuke Narita, Larry Samuelson, Rajiv Sethi, Giacomo Sellari, E. Somanathan, Lones Smith, Peter Vanderschraaf, Muhamet Yildiz, and Eduardo Zambrano for helping me with particular points. Thanks go especially to Tim McCarthy, Seth Ditchik, and Peter Dougherty, my editors at Princeton University Press, who had the faith in me to make this volume possible.

1 Decision Theory and Human Behavior
People are not logical. They are psychological. Anonymous
People often make mistakes in their maths. This does not mean that we should abandon arithmetic. Jack Hirshleifer

Decision theory is the analysis of the behavior of an individual facing nonstrategic uncertainty—that is, uncertainty that is due to "Nature" (a coin flip, crop loss, death, and the like) or, if other individuals are involved, whose behavior is treated as a statistical distribution known to the decision-maker. Decision theory depends on probability theory, which was developed in the Seventeenth and Eighteenth centuries by such notables as Blaise Pascal (1623-1662), Daniel Bernoulli (1700-1782), and Thomas Bayes (1702-1761).

We begin by defining the notion of a rational actor as an individual with consistent preferences (1.1). There is more nonsense written by smart people interpreting the rational actor model than in any other area of behavioral science. This situation flows from three facts. The first is that economists, the group relying most strongly on the rational actor model, have historically insisted that rational individuals are selfish. If rationality implied selfishness, the only rational individuals would be what we term sociopaths—a distinctly abnormal set of characters. The second fact is that the term "rational" is used in many different and often vague and conflicting ways. The philosophers have not yet succeeded in untangling the resulting intellectual mess, so it seems wise to avoid the issue where possible. The third reason is that beliefs, called subjective priors in decision theory, which logically stand between choices and payoffs, are taken as primitive data by the rational actor model, whereas in fact beliefs are the product of social processes. Of course, preferences are also socially determined, but economic theory is comfortable with preference heterogeneity, whereas a commonality of belief, as we shall see, is a prerequisite to effective coordination in games.
This theoretical lacuna must be filled before the rational actor model can fulfill the grand expectations of its supporters. To stress the importance of the interplay between beliefs and preferences in modeling choice, we often describe the rational actor model as the beliefs, preferences, and constraints model, which we abbreviate as the BPC model. The BPC terminology has the added attraction of avoiding the confusing and value-laden term "rational."

The BPC model requires only preference consistency, which can be defended on basic evolutionary grounds. While there are eminent critics of preference consistency, their claims are valid only in a few narrow areas. Because preference consistency does not presuppose unlimited information-processing capacities and perfect knowledge, even bounded rationality (Simon 1982) is consistent with the BPC model.¹ Because one cannot do behavioral game theory, by which I mean the application of game theory to the experimental study of human behavior, without assuming preference consistency, we must accept this axiom to avoid the analytical weaknesses of the behavioral disciplines that reject the BPC model, including psychology, anthropology, and sociology (Gintis 2007b).

Behavioral decision theorists have argued that there are important areas in which individuals appear to have inconsistent preferences. Except when individuals do not know their own preferences, this is a conceptual error based on a misspecification of the decision-maker's preference function. We show in this chapter that, assuming individuals know their preferences, adding information concerning the current state of the individual to the choice space eliminates preference inconsistency. Moreover, this addition is completely reasonable, because preference functions do not make any sense unless we include information about the decision-maker's current state. When we are hungry, or scared, or sleepy, or sexually deprived, our preference ordering will adjust accordingly. The idea that we should have a utility function that does not depend on our current wealth, the current time, or our current strategic circumstances is also not plausible. Traditional decision theory ignores the individual's current state, but this is just an oversight that behavioral game theory has brought to our attention.

Compelling experiments in behavioral decision theory show that humans violate the principle of expected utility in systematic ways (1.7).
¹ Indeed, it can be shown (Zambrano 2005) that every boundedly rational individual is a fully rational individual subject to an appropriate set of Bayesian priors concerning the state of nature.
Again, it must be stressed that this does not imply that humans violate preference consistency over the appropriate choice space, but rather that they have incorrect beliefs deriving from what might be termed "folk theory" and make systematic performance errors in important cases.

To understand why this is the case, we begin by noting that, with the exception of hyperbolic discounting when time is involved (1.4), there are no reported failures of the expected utility theorem in non-humans, and there are some extremely beautiful examples of its satisfaction (Real 1991).² The difference between humans and other animals is that the latter are tested in real life, or in elaborate simulations of real life, as in Leslie Real's work with bumblebees (1991), where subject bumblebees are released into elaborate spatial models of flowerbeds. Humans, by contrast, are tested using imperfect analytical models of real-life lotteries. While it is important to know how humans choose in such situations, there is certainly no guarantee they will make the same choices in the real-life situation and in the situation analytically generated to represent it. Evolutionary game theory is based on the observation that individuals are more likely to adopt behaviors that appear to be successful for others. A heuristic that says "adopt risk profiles that appear to have been successful for others" may lead to preference consistency even when individuals are incapable of evaluating analytically presented lotteries in the laboratory.

In addition to the explanatory success of theories based on the BPC model, supporting evidence from contemporary neuroscience suggests that expected utility maximization is not simply an "as if" story. In fact, the brain's neural circuitry actually makes choices by internally representing the payoffs of various alternatives as neural firing rates, and choosing a maximal such rate (Shizgal 1999, Glimcher 2003, Dorris and Glimcher 2003, Glimcher and Rustichini 2004, Glimcher, Dorris and Bayer 2005). Neuroscientists increasingly find that an aggregate decision-making process in the brain synthesizes all available information into a single, unitary value (Parker and Newsome 1998, Schall and Thompson 1999). Indeed, when animals are tested in a repeated-trial setting with variable reward, dopamine neurons appear to encode the difference between the reward that an animal expected to receive and the reward that an animal actually received on a particular trial (Schultz, Dayan and Montague 1997, Sutton and Barto 2000), an evaluation mechanism that enhances the environmental sensitivity of the animal's decision-making system.
² Territoriality in many species is an indication of loss aversion (Gintis 2007a).
This error-prediction mechanism has the drawback of only seeking local optima (Sugrue, Corrado and Newsome 2005). Montague and Berns (2002) address this problem, showing that the orbitofrontal cortex and striatum contain a mechanism for more global predictions that include risk assessment and discounting of future rewards. Their data suggest a decision-making model that is analogous to the famous Black-Scholes options-pricing equation (Black and Scholes 1973).

The existence of an integrated decision-making apparatus in the human brain is itself predicted by evolutionary theory. The fitness of an organism depends on how effectively it makes choices in an uncertain and varying environment. Effective choice must be a function of the organism's state of knowledge, which consists of the information supplied by the sensory inputs that monitor the organism's internal states and its external environment. In relatively simple organisms, the choice environment is primitive and distributed in a decentralized manner over sensory inputs. But in three separate groups of animals, the craniates (vertebrates and related creatures), arthropods (including insects, spiders, and crustaceans), and cephalopods (squid, octopuses, and other mollusks), a central nervous system with a brain (a centrally located decision-making and control apparatus) evolved. The phylogenetic tree of vertebrates exhibits increasing complexity through time, and increasing metabolic and morphological costs of maintaining brain activity. Thus, the brain evolved because larger and more complex brains, despite their costs, enhanced the fitness of their carriers. Brains therefore are ineluctably structured to make consistent choices in the face of the various constellations of sensory inputs their bearers commonly experience.

Before the contributions of Bernoulli, Savage, von Neumann, and other experts, no creature on Earth knew how to value a lottery. The fact that people do not know how to evaluate abstract lotteries does not mean that they lack consistent preferences over the lotteries that they face in their daily lives. Despite these provisos, experimental evidence on choice under uncertainty is still of great importance, because in the modern world we are increasingly called upon to make such "unnatural" choices based on scientific evidence concerning payoffs and their probabilities.
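The reward-prediction-error mechanism described above is easy to state as an update rule. The following sketch is purely illustrative and is not drawn from the text: the learning rate, the reward sequence, and the function name are all assumptions made for the example.

    # Minimal sketch of a reward-prediction-error (dopamine-like) update rule.
    # All names and parameter values are illustrative assumptions, not taken
    # from the text: the agent tracks an estimate V of the expected reward and
    # moves it by a fraction alpha of the prediction error on each trial.

    def update_value(V, reward, alpha=0.1):
        """Return the updated reward estimate after observing one trial."""
        prediction_error = reward - V   # difference between expected and received reward
        return V + alpha * prediction_error

    V = 0.0
    for reward in [1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 0.0, 1.0]:  # hypothetical variable rewards
        V = update_value(V, reward)
    print(round(V, 3))  # the estimate drifts toward the average reward rate

Because each update responds only to the most recent surprise, such a rule tracks local reward rates well but, as noted above, tends to find only local optima.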
1.1 Beliefs, Preferences, and Constraints

In this section we develop a set of behavioral properties, among which consistency is the most prominent, that together ensure that we can model the individual as though he were maximizing a preference function over outcomes.

A binary relation ⊙_A on a set A is a subset of A × A. We usually write the proposition (x, y) ∈ ⊙_A as x ⊙_A y. For instance, the arithmetical operator "less than" (<) is a binary relation, where (x, y) ∈ < is normally written x < y.³ A preference relation ⪰_A on A is a binary relation with the following three properties, which must hold for all x, y, z ∈ A and any set B ⊆ A:

1. Complete: x ⪰_A y or y ⪰_A x;
2. Transitive: x ⪰_A y and y ⪰_A z imply x ⪰_A z;
3. Independence from Irrelevant Alternatives: For x, y ∈ B, x ⪰_B y if and only if x ⪰_A y.

³ Additional binary relations over the set ℝ of real numbers include ">", "<", "≤", "=", "≥", and "≠", but "+" is not a binary relation, because x + y is not a proposition.

Because of the third property, we need not specify the choice set, and can simply write x ⪰ y. We also make the behavioral assumption that given any choice set A, the individual chooses an element x ∈ A such that for all y ∈ A, x ⪰ y. When x ⪰ y, we say "x is weakly preferred to y."

Completeness implies that any member of A is weakly preferred to itself (for any x in A, x ⪰ x). In general, we say a binary relation ⊙ is reflexive if, for all x, x ⊙ x. Thus, completeness implies reflexivity. We refer to ⪰ as "weak preference" in contrast with "strong preference" ≻. We define x ≻ y to mean "it is false that y ⪰ x." We say x and y are equivalent if x ⪰ y and y ⪰ x, and we write x ≈ y. As an exercise, you may use elementary logic to prove that if ⪰ satisfies the completeness condition, then ≻ satisfies the following exclusion condition: if x ≻ y, then it is false that y ≻ x.

The second condition is transitivity, which says that x ⪰ y and y ⪰ z imply x ⪰ z. It is hard to see how this condition could fail for anything we might like to call a "preference ordering."⁴ As an exercise, you may show that x ≻ y and y ⪰ z imply x ≻ z, and that x ⪰ y and y ≻ z imply x ≻ z. Similarly, you may use elementary logic to prove that if ⪰ satisfies the completeness condition, then ≈ is transitive (i.e., satisfies the transitivity condition).

⁴ The only plausible model of intransitivity with some empirical support is regret theory (Loomes 1988, Sugden 1993). Their analysis applies, however, only to a narrow range of choice situations.

When these three conditions are satisfied, we say the preference relation ⪰ is consistent. If ⪰ is a consistent preference relation, then there always exists a preference function such that the individual behaves as if maximizing this preference function over the set A from which he or she is constrained to choose. Formally, we say that a preference function u: A → ℝ represents a binary relation ⪰ if, for all x, y ∈ A, u(x) ≥ u(y) if and only if x ⪰ y. We have

THEOREM 1.1 A binary relation ⪰ on the finite set A of payoffs can be represented by a preference function u: A → ℝ if and only if ⪰ is consistent.

It is clear that u(·) is not unique, and indeed, we have the following.

THEOREM 1.2 If u(·) represents the preference relation ⪰ and f(·) is a strictly increasing function, then v(·) = f(u(·)) also represents ⪰. Conversely, if both u(·) and v(·) represent ⪰, then there is an increasing function f(·) such that v(·) = f(u(·)).

The first half of the theorem is true because if f is strictly increasing, then u(x) > u(y) implies v(x) = f(u(x)) > f(u(y)) = v(y), and conversely. For the second half, suppose u(·) and v(·) both represent ⪰, and for any y ∈ ℝ such that v(x) = y for some x ∈ X, let f(y) = u(v⁻¹(y)), which is possible because v is an increasing function. Then f(·) is increasing (because it is the composition of two increasing functions) and f(v(x)) = u(v⁻¹(v(x))) = u(x), which proves the theorem.
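Theorem 1.1 can be illustrated computationally for a small finite set of payoffs. The sketch below is not from the text; the choice set, the particular relation, and the helper names are assumptions. It builds a utility function by counting, for each alternative, how many alternatives it is weakly preferred to, and then checks that this function represents the relation.

    # Illustrative check of Theorem 1.1 on a small finite example. The choice
    # set and the relation are assumptions made for the sketch. For a complete,
    # transitive weak preference relation, counting how many alternatives each
    # element is weakly preferred to yields a utility u with u(x) >= u(y)
    # exactly when x is weakly preferred to y.

    from itertools import product

    A = ["apple", "banana", "cherry"]
    # weakly_preferred[(x, y)] is True when x is weakly preferred to y;
    # here apple is weakly preferred to banana, and banana to cherry.
    weakly_preferred = {
        (x, y): A.index(x) <= A.index(y) for x, y in product(A, repeat=2)
    }

    def utility(x):
        return sum(weakly_preferred[(x, y)] for y in A)

    # Verify the representation property on every pair.
    assert all(
        (utility(x) >= utility(y)) == weakly_preferred[(x, y)]
        for x, y in product(A, repeat=2)
    )
    print({x: utility(x) for x in A})  # {'apple': 3, 'banana': 2, 'cherry': 1}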
1.2 The Meaning of Rational Action

The origins of the BPC model lie in Nineteenth Century Utilitarianism, and in the works of Jeremy Bentham (1748-1832) and Cesare Beccaria (1738-1794), who interpreted "utility" as happiness. In his famous Foundations of Economic Analysis (1947), economist Paul Samuelson removed the hedonistic assumptions of utility maximization by arguing, as we have in the previous section, that utility maximization presupposes nothing more than transitivity and some harmless technical conditions akin to those specified above.

The rational actor model has been widely misrepresented by those who embrace it, and hence misunderstood by those who do not. The most prominent misunderstanding is that rational actors are self-interested. For instance, if two rational individuals bargain over the division of money they jointly earned, it is thought that rational action requires that each individual must try to maximize his share. Similarly, it is thought that a rational actor who votes in an election must be motivated by self-interest, and will vote for the candidate that is most likely to secure his personal gain.

In fact, there is nothing "irrational" about caring for others, believing in fairness, or sacrificing for a social ideal. Nor do such preferences contradict decision theory. For instance, suppose a man with $100 is considering how much to consume himself and how much to give to charity. Suppose he faces a tax or subsidy such that for each $1 he contributes to charity, he is obliged to pay $p > 0. Thus, p > 1 represents a tax, while p < 1 represents a subsidy. We can then treat p as the price of a unit contribution to charity, and model the individual as maximizing his utility for personal consumption x and contributions to charity y, say u(x, y), subject to the budget constraint x + py = 100. Clearly, it is perfectly rational for him to choose y > 0 (a numerical sketch of this example appears below). Indeed, in a recent paper, Andreoni and Miller (2002) have shown that in making choices of this type, consumers behave in the same way as they do when choosing among personal consumption goods; i.e., they satisfy the so-called generalized axiom of revealed preference.

A second misconception is that decision theory requires that the choices people make be welfare-improving, when in fact people often are slaves to passions or upholders of norms that are self-harming. For instance, it is often held that people are irrational when they choose to smoke cigarettes, engage in unsafe sex, commit crimes in the face of extremely heavy penalties, or sacrifice their health to junk food consumption. Yet these behaviors do not in any way violate the principles of preference consistency.
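Here is the promised numerical sketch of the charitable-giving example. The Cobb-Douglas form of u(x, y) and the price values are assumptions chosen for illustration; the text requires only some consistent u(x, y) and the budget constraint x + py = 100.

    # Minimal sketch of the charitable-giving example above. The Cobb-Douglas
    # utility u(x, y) = x**0.7 * y**0.3 and the prices are illustrative
    # assumptions; the budget constraint x + p*y = 100 is the one in the text.

    def best_gift(p, budget=100.0, steps=10_000):
        """Grid-search the charity contribution y that maximizes u on the budget line."""
        best_y, best_u = 0.0, float("-inf")
        for i in range(1, steps):
            y = (budget / p) * i / steps          # feasible contribution to charity
            x = budget - p * y                    # remaining personal consumption
            u = x**0.7 * y**0.3
            if u > best_u:
                best_y, best_u = y, u
        return best_y

    print(round(best_gift(p=1.0), 1))   # no tax or subsidy: gives about $30
    print(round(best_gift(p=2.0), 1))   # a tax (p > 1) reduces, but does not eliminate, giving

A consistent utility maximizer with these (assumed) preferences rationally chooses y > 0, and responds to the price of giving exactly as revealed-preference theory would require.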
A third misconception is that if humans fail to behave as prescribed by decision theory, they are irrational. In fact, they may simply be ignorant or misinformed. However, if human subjects consistently make intransitive choices over lotteries (e.g., 1.7), then either they do not satisfy the axioms of expected utility theory or they do not know how to evaluate lotteries. The latter is often called performance error. Performance error can be reduced or eliminated by formal instruction, so that the experts on whom society relies to make efficient decisions may behave quite rationally, even in cases where the average individual violates preference consistency.

1.3 Why Are Preferences Consistent?

The traditional answer to the question "Why are preferences consistent?" is that humans are "rational beings," and preference consistency satisfies the "demands of reason." The problem with this justification is that decision theory often applies extremely well to nonhuman species, including insects and plants (Real 1991, Alcock 1993, Kagel, Battalio and Green 1995), whose mental apparatus falls far short of the capacity to exercise "rational thought." It is, of course, possible that there is one explanation for humans and an altogether different explanation for other living things, but this seems rather far-fetched.

A more cogent explanation of preference consistency comes from evolutionary biology (Robson 1995). Biologists define the "fitness" of an organism as its expected number of offspring. Assume, for simplicity, asexual reproduction. A maximally fit individual will then produce the maximal expected number of offspring, each of which will inherit the genes for maximal fitness. Thus, fitness maximization is a precondition for evolutionary survival. If organisms maximized fitness directly, the conditions of decision theory would be directly satisfied, because we could simply represent the organism's utility function as its fitness.

However, we know that organisms (including humans) do not directly maximize fitness. For instance, moths fly into flames, few animals are capable of avoiding automobiles in the road, and humans voluntarily limit family size. In fact, biological fitness is a theoretical abstraction unknown to virtually every real-life organism. Rather than literally maximizing fitness, organisms have preference orderings that are themselves subject to selection according to their ability to promote fitness (Darwin 1872). We can expect preferences to satisfy the completeness condition because an organism must be able to make a choice in any situation it habitually faces, or it will be out-competed by another whose preference ordering can make such a choice.

For similar evolutionary reasons, we would expect the transitivity condition to hold over choices that have some evolutionary meaning to the rational individual. Of course, unless the current environment of choice is the same as the historical environment under which the individual's preference system evolved, we would not expect an individual's choices to be fitness maximizing, or even welfare-improving. For instance, people in advanced technological societies have a tendency to obesity that can be explained by a "preference for high-calorie foods" that may not be fitness-enhancing today, but most likely was in the evolutionary history of our species, which, until about 10,000 years ago, reflected the conditions of existence of small hunter-gatherer bands under continual threat of starvation.

Because human preferences are deeply affected by culture as well as genes, it is possible that cultural influences on preferences lead to preference inconsistency. I am not aware of any empirical example of such a phenomenon, though.
Indeed, much of culture is instrumental, in the form of technical information and social conventions that are deployed in the process of preference maximization, and internalized norms that represent valued individual goals, alongside the more mundane material and social needs of the individual.

This biological explanation also suggests how preference consistency might fail in an imperfectly integrated organism. Suppose the organism has three decision centers in its brain, and for any pair of choices, majority rule determines which the organism prefers. Suppose the available choices are A, B, and C, and the three decision centers have preferences A ≻ B ≻ C, B ≻ C ≻ A, and C ≻ A ≻ B, respectively. Then when offered A or B, the individual chooses A; when offered B or C, the individual chooses B; and when offered A or C, the individual chooses C. Thus A ≻ B ≻ C ≻ A, and we have intransitivity. Of course, if an objective fitness is associated with each of these choices, Darwinian selection will favor a mutant who suppresses two of the three decision centers, or better yet, integrates them.

1.4 Time Inconsistency

Several human behavior patterns appear to exhibit weakness of will, in the sense that if there is a long time period between choosing and experiencing the costs and benefits of the choice, individuals can choose wisely. But when costs or benefits are immediate, people make poor choices, long-run payoffs being sacrificed in favor of immediate payoffs. For instance, smokers may know that their habit will harm them in the long run, but cannot bear to sacrifice the present urge to indulge in favor of a far-off reward of a healthy future. Similarly, a couple in the midst of sexual passion may appreciate that they may well regret their inadequate safety precautions at some point in the future, but they cannot control their present urges. We call this behavior time-inconsistent.⁵

⁵ For an excellent survey of empirical results in this area, see Frederick, Loewenstein and O'Donoghue (2002).

Are people time consistent? Take, for instance, impulsive behavior. Economists are wont to argue that what appears to be "impulsive"—cigarette smoking, drug use, unsafe sex, overeating, dropping out of school, punching out your boss, and the like—may in fact be welfare-maximizing for people who have high time discount rates or who prefer acts that happen to have high future costs. Controlled experiments in the laboratory cast doubt on this explanation, indicating that people exhibit a systematic tendency to discount the near future at a higher rate than the distant future (Chung and Herrnstein 1967, Loewenstein and Prelec 1992, Herrnstein and Prelec 1992, Fehr and Zych 1994, Kirby and Herrnstein 1995, McClure, Laibson, Loewenstein and Cohen 2004).

For instance, consider the following experiment conducted by Ainslie and Haslam (1992). Subjects were offered a choice between $10 on the day of the experiment or $11 a week later. Many chose to take the $10 without delay. However, when the same subjects were offered $10 to be delivered a year from the day of the experiment or $11 to be delivered a year and a week from the day of the experiment, many of these same subjects, who could not wait a week right now for an extra 10%, preferred to wait a week for an extra 10%, provided the agreed-upon wait was far in the future.

It is instructive to see exactly where the consistency conditions are violated in this example.
Let x mean "$10 at some time t" and let y mean "$11 at time t + 7," where time t is measured in days. Then the present-oriented subjects display x ≻ y when t = 0 and y ≻ x when t = 365. Thus the exclusion condition for ≻ is violated, and because the completeness condition for ⪰ implies the exclusion condition for ≻, the completeness condition must be violated as well.

However, time inconsistency disappears if we model the individuals as choosing over a slightly more complicated choice space, in which the distance between the time of choice and the time of delivery of the object chosen is explicitly included in the object of choice. For instance, we may write x₀ to mean $10 delivered immediately and x₃₆₅ to mean $10 delivered in a year from today, and similarly for y₇ and y₃₇₂. Then the observation that x₀ ≻ y₇ and y₃₇₂ ≻ x₃₆₅ is no contradiction.

Of course, if you are not time consistent, and if you know this fact, you should not expect yourself to carry out your plans for the future when the time comes. Thus, you may be willing to precommit yourself to making these future choices, even at a cost. For instance, if you are saving in year 1 for a purchase in year 3, but you know you will be tempted to spend the money in year 2, you can put it in a bank account that cannot be accessed until the year after next. My teacher Leo Hurwicz called this the "piggy bank effect."

The central theorem on choice over time is that time consistency results from assuming that utility is additive across time periods and the instantaneous utility function is the same in all time periods, with future utilities discounted to the present at a fixed rate (Strotz 1955). This is called exponential discounting, and is widely assumed in economic models. For instance, suppose an individual can choose between two consumption streams x = x₀, x₁, ... or y = y₀, y₁, .... According to exponential discounting, he has a utility function u(x) and a constant δ ∈ (0, 1) such that the total utility of stream x is given by⁶

\[
U(x_0, x_1, \ldots) = \sum_{k=0}^{\infty} \delta^k u(x_k). \tag{1.1}
\]

⁶ Throughout this text, we write x ∈ (a, b) for a < x < b.

We call δ the individual's discount factor. Often we write δ = e^{-r}, where we interpret r > 0 as the individual's one-period continuous-compounding interest rate, in which case (1.1) becomes

\[
U(x_0, x_1, \ldots) = \sum_{k=0}^{\infty} e^{-rk} u(x_k). \tag{1.2}
\]

This form clarifies why we call this "exponential" discounting. The individual strictly prefers consumption stream x over stream y if and only if U(x) > U(y). In the simple compounding case, where the interest accrues at the end of the period, we write δ = 1/(1 + r), and (1.2) becomes

\[
U(x_0, x_1, \ldots) = \sum_{k=0}^{\infty} \frac{u(x_k)}{(1 + r)^k}. \tag{1.3}
\]

Despite the elegance of exponential discounting, observed intertemporal choice for humans appears to fit more closely the model of hyperbolic discounting (Ainslie and Haslam 1992, Ainslie 1975, Laibson 1997), first observed by Richard Herrnstein in studying animal behavior (Herrnstein, Laibson and Rachlin 1997), and reconfirmed many times since (Green, Myerson, Holt, Slevin and Estle 2004). For instance, continuing the previous example, let z_t mean "an amount of money z delivered t days from today," and let the utility of z_t be u(z_t) = z/(t + 1). The value of x₀ is thus u(x₀) = u(10₀) = 10/1 = 10, and the value of y₇ is u(y₇) = u(11₇) = 11/8 = 1.375, so x₀ ≻ y₇. But u(x₃₆₅) = 10/366 = 0.027 while u(y₃₇₂) = 11/373 = 0.029, so y₃₇₂ ≻ x₃₆₅.
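The preference reversal in the example above, and its disappearance under exponential discounting, can be reproduced with a few lines of code. The hyperbolic valuation u(z_t) = z/(t + 1) is the one used in the text; the exponential discounter's daily discount factor is an assumed value included only for contrast.

    # Sketch of the preference reversal above. The hyperbolic valuation
    # u(z, t) = z / (t + 1) is the one used in the text; the exponential
    # discounter's daily factor delta = 0.999 is an assumption for contrast.

    def hyperbolic(z, t):
        return z / (t + 1)

    def exponential(z, t, delta=0.999):
        return z * delta**t

    for t in (0, 365):
        ten_now, eleven_later = hyperbolic(10, t), hyperbolic(11, t + 7)
        print(f"hyperbolic,  t={t}: choose", "$10" if ten_now > eleven_later else "$11")

    for t in (0, 365):
        ten_now, eleven_later = exponential(10, t), exponential(11, t + 7)
        print(f"exponential, t={t}: choose", "$10" if ten_now > eleven_later else "$11")

The hyperbolic valuation picks $10 today but $11 at the distant horizon, reproducing the Ainslie-Haslam reversal, while the exponential discounter makes the same choice at both horizons.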
There is also evidence that people have different rates of discount for different types of outcomes (Loewenstein 1987, Loewenstein and Sicherman 1991). This would be irrational for outcomes that could be bought and sold on perfect markets, because all such outcomes should be discounted at the market interest rate in equilibrium. But, of course, there are many things that people care about that cannot be bought and sold on perfect markets.

Neurological research suggests that balancing current and future payoffs involves adjudication among structurally distinct and spatially separated modules that arose in different stages in the evolution of Homo sapiens (Tooby and Cosmides 1992, Sloman 2002, McClure et al. 2004). Long-term decision-making capacity is localized in specific neural structures in the prefrontal lobes and functions improperly when these areas are damaged, despite the fact that subjects with such damage appear to be otherwise completely normal in brain functioning (Damasio 1994). Homo sapiens may be structurally predisposed, in terms of brain architecture, toward a systematic present-orientation.

In sum, time inconsistency doubtless exists and is important in modeling human behavior, but this does not imply that people are "irrational" in the weak sense of preference consistency. Indeed, we can model the behavior of time-inconsistent rational individuals by assuming they maximize their time-dependent preference functions (O'Donoghue and Rabin 1999a,b, 2000, 2001). For axiomatic treatments of time-dependent preferences, see Ahlbrecht and Weber (1995) and Ok and Masatlioglu (2003). In fact, humans are much closer to time consistency, and have much longer time horizons, than any other species, probably by several orders of magnitude (Stephens, McLinn and Stevens 2002, Hammerstein 2003). We do not know why biological evolution so little values time consistency and long time horizons even in long-lived creatures.

1.5 The Expected Utility Principle

What about decisions in which a stochastic event determines the payoffs to the players? Let X be a set of "prizes." A lottery with payoffs in X is a function p: X → [0, 1] such that Σ_{x∈X} p(x) = 1. We interpret p(x) as the probability that the payoff is x ∈ X. If X = {x₁, ..., x_n} for some finite number n, we write p(x_i) = p_i.

The expected value of a lottery is the sum of the payoffs, where each payoff is weighted by the probability that the payoff will occur. If the lottery l has payoffs x₁, ..., x_n with probabilities p₁, ..., p_n, then the expected value E[l] of the lottery l is given by

\[
\mathrm{E}[l] = \sum_{i=1}^{n} p_i x_i.
\]

The expected value is important because of the law of large numbers (Feller 1950), which states that as the number of times a lottery is played goes to infinity, the average payoff converges to the expected value of the lottery with probability one.

Consider the lottery l₁ in pane (a) of Fig. 1.1, where p is the probability of winning amount a and 1 − p is the probability of winning amount b. The expected value of the lottery is then E[l₁] = pa + (1 − p)b. Note that we model a lottery much like an extensive form game, except that there is only one player. Consider the lottery l₂ with the three payoffs shown in pane (b) of Fig. 1.1. Here p is the probability of winning amount a, q is the probability of winning amount b, and 1 − p − q is the probability of winning amount c. The expected value of the lottery is E[l₂] = pa + qb + (1 − p − q)c. A lottery l₃ with n payoffs is shown in pane (c) of Fig. 1.1. The prizes are now a₁, ..., a_n with probabilities p₁, ..., p_n, respectively. The expected value of the lottery is now E[l₃] = p₁a₁ + p₂a₂ + ... + p_n a_n.
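The law of large numbers claim can be illustrated with a short simulation. The particular three-prize lottery below is an assumption invented for the sketch.

    # Sketch: the average payoff of repeated plays of a lottery approaches its
    # expected value. The three-prize lottery below is an illustrative assumption.
    import random

    prizes = [0.0, 10.0, 100.0]
    probs  = [0.5, 0.4, 0.1]

    expected_value = sum(p * x for p, x in zip(probs, prizes))   # 0*0.5 + 10*0.4 + 100*0.1 = 14

    random.seed(0)
    n = 100_000
    average = sum(random.choices(prizes, weights=probs)[0] for _ in range(n)) / n

    print(expected_value)        # 14.0
    print(round(average, 2))     # close to 14 for large n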
In this section we generalize the previous argument, developing a set of behavioral properties that yield both a utility function over outcomes and a probability distribution over states of nature, such that the expected utility principle holds. Von Neumann and Morgenstern (1944), Friedman and Savage (1948), Savage (1954), and Anscombe and Aumann (1963) showed that the expected utility principle can be derived from the assumption that individuals have consistent preferences over an appropriate set of lotteries. We outline here Savage's classic analysis of this problem.

[Figure 1.1. Lotteries with two, three, and n potential outcomes, shown in panes (a), (b), and (c).]

For the rest of this section, we assume ⪰ is a preference relation (1.1). To ensure that the analysis is not trivial, we also assume that x ⪰ y is false for at least some x, y ∈ X. Savage's accomplishment was to show that if the individual has a preference relation over lotteries that has some plausible properties, then not only can the individual's preferences be represented by a utility function, but we can infer the probabilities the individual implicitly places on various events, and the expected utility principle holds for these probabilities.

To see this, let Ω be a finite set of states of nature. We call subsets A ⊆ Ω events. Also, let L be a set of "lotteries," where a lottery is a function π: Ω → X that associates with each state of nature ω ∈ Ω a payoff π(ω) ∈ X. Note that this concept of a lottery does not include a probability distribution over the states of nature. Rather, the Savage axioms allow us to associate a subjective prior with each state of nature ω, expressing the decision-maker's personal assessment of the probability that ω will occur. We suppose that the individual chooses among lotteries without knowing the state of nature, after which "Nature" chooses the state ω ∈ Ω that obtains, so that if the individual chose lottery π ∈ L, his payoff is π(ω).

Now suppose the individual has a preference relation ⪰ over L (we use the same symbol for preferences over both outcomes and lotteries). We seek a set of plausible properties of ⪰ over lotteries that together allow us to deduce (a) a utility function u: X → ℝ corresponding to the preference relation ⪰ over outcomes in X, and (b) a probability distribution p: Ω → ℝ such that the expected utility principle holds with respect to the preference relation ⪰ over lotteries and the utility function u(·); i.e., if we define

\[
\mathrm{E}_{\pi}[u; p] = \sum_{\omega \in \Omega} p(\omega)\, u(\pi(\omega)), \tag{1.4}
\]

then for any π, ρ ∈ L,

\[
\pi \succ \rho \iff \mathrm{E}_{\pi}[u; p] > \mathrm{E}_{\rho}[u; p].
\]

Our first condition is that π ⪰ ρ depends only on states of nature where π and ρ have different outcomes. We state this more formally as follows.

A1. For any π, ρ, π′, ρ′ ∈ L, let A = {ω ∈ Ω | π(ω) ≠ ρ(ω)}. Suppose we also have A = {ω ∈ Ω | π′(ω) ≠ ρ′(ω)}. Suppose also that π(ω) = π′(ω) and ρ(ω) = ρ′(ω) for ω ∈ A. Then π ⪰ ρ ⟺ π′ ⪰ ρ′.

This axiom allows us to define a conditional preference π ≻_A ρ, where A ⊆ Ω, which we interpret as "π is strictly preferred to ρ, conditional on event A," as follows. We say π ≻_A ρ if, for some π′, ρ′ ∈ L, π(ω) = π′(ω) and ρ(ω) = ρ′(ω) for ω ∈ A, π′(ω) = ρ′(ω) for ω ∉ A, and π′ ≻ ρ′. Because of A1, this is well defined (i.e., π ≻_A ρ does not depend on the particular π′, ρ′ ∈ L). This allows us to define ⪰_A and ≈_A in a similar manner. We then define an event A ⊆ Ω to be null if π ≈_A ρ for all π, ρ ∈ L.
Our second condition is then the following, where we write π = x|A to mean π(ω) = x for all ω ∈ A (i.e., π = x|A means π is a lottery that pays x when A occurs).

A2. If A ⊆ Ω is not null, then for all x, y ∈ X, π = x|A ≻_A π = y|A ⟺ x ≻ y.

This axiom says that a natural relationship between outcomes and lotteries holds: if π pays x given event A and ρ pays y given event A, and if x ≻ y, then π ≻_A ρ, and conversely.

Our third condition asserts that the probability that a state of nature occurs is independent of the outcome one receives when the state occurs. The difficulty in stating this axiom is that the individual cannot choose probabilities, but only lotteries. But if the individual prefers x to y, and if A, B ⊆ Ω are events, then the individual treats A as "more probable" than B if and only if a lottery that pays x when A occurs and y when A does not occur is preferred to a lottery that pays x when B occurs and y when B does not. However, this must be true for any x, y ∈ X such that x ≻ y, or the individual's notion of probability is incoherent (i.e., it depends on what particular payoffs we are talking about—for instance, "wishful thinking," where if the prize associated with an event increases, the individual thinks it is more likely to occur). More formally, we have the following, where we write π = x, y|A to mean "π(ω) = x for ω ∈ A and π(ω) = y for ω ∉ A."

A3. Suppose x ≻ y, x′ ≻ y′, π, ρ, π′, ρ′ ∈ L, and A, B ⊆ Ω. Suppose that π = x, y|A, ρ = x′, y′|A, π′ = x, y|B, ρ′ = x′, y′|B. Then π ≻ π′ ⟺ ρ ≻ ρ′.

The fourth condition is a weak version of first-order stochastic dominance, which says that if one lottery has a higher payoff than another for any event, then the first is preferred to the second.

A4. For any event A, if x ≻ ρ(ω) for all ω ∈ A, then π = x|A ⪰_A ρ. Also, for any event A, if ρ(ω) ≻ x for all ω ∈ A, then ρ ⪰_A π = x|A.

In other words, if for any event A, π = x on A pays more than the best ρ can pay on A, then π ⪰_A ρ, and conversely.

Finally, we need a technical property to show that a preference relation can be represented by a utility function. It says that for any π, ρ ∈ L and any x ∈ X, we can partition Ω into a number of disjoint subsets A₁, ..., A_n such that ∪_i A_i = Ω, and for each A_i, if we change π so that its payoff is x on A_i, then π is still preferred to ρ, and similarly, for each A_i, if we change ρ so that its payoff is x on A_i, then π is still preferred to ρ. This means that no payoff is "supergood," so that no matter how unlikely an event A is, a lottery with that payoff when A occurs is always preferred to a lottery with a different payoff when A occurs. Similarly, no payoff can be "superbad." The condition is formally as follows.

A5. For all π, π′, ρ, ρ′ ∈ L with π ≻ ρ, and for all x ∈ X, there are disjoint subsets A₁, ..., A_n of Ω such that ∪_i A_i = Ω and, for any A_i, (a) if π′(ω) = x for ω ∈ A_i and π′(ω) = π(ω) for ω ∉ A_i, then π′ ≻ ρ, and (b) if ρ′(ω) = x for ω ∈ A_i and ρ′(ω) = ρ(ω) for ω ∉ A_i, then π ≻ ρ′.

We then have Savage's theorem.

THEOREM 1.3 Suppose A1–A5 hold. Then there is a probability function p on Ω and a utility function u: X → ℝ such that for any π, ρ ∈ L, π ≻ ρ if and only if E_π[u; p] > E_ρ[u; p].

The proof of this theorem is somewhat tedious (it is sketched in Kreps 1988).

We call the probability p the individual's Bayesian prior, or subjective prior, and say that A1–A5 imply Bayesian rationality, because together they imply Bayesian probability updating.
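The content of Savage's theorem is that, once the subjective prior p and the utility u are recovered, ranking lotteries reduces to comparing expected utilities as in (1.4). The following sketch illustrates only that final step; the states, prior, utility values, and the two lotteries are assumptions invented for the example.

    # Sketch of the representation in Savage's theorem: a lottery maps each
    # state of nature to a prize, and with a subjective prior p and utility u
    # the agent ranks lotteries by expected utility, as in equation (1.4).
    # The states, prior, utility, and the two lotteries are illustrative assumptions.

    states = ["rain", "shine"]
    prior = {"rain": 0.3, "shine": 0.7}          # subjective probabilities over states
    utility = {"wet": 0.0, "dry": 1.0, "tan": 2.0}

    umbrella = {"rain": "dry", "shine": "dry"}   # lottery: carry an umbrella
    sunblock = {"rain": "wet", "shine": "tan"}   # lottery: leave it at home

    def expected_utility(lottery):
        return sum(prior[s] * utility[lottery[s]] for s in states)

    print(expected_utility(umbrella))  # 0.3*1 + 0.7*1 = 1.0
    print(expected_utility(sunblock))  # 0.3*0 + 0.7*2 = 1.4, so sunblock is preferred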
There is always uncertainty as to the degree of success of the various options in $X$, which means essentially that each $x \in X$ determines a lottery that pays $i$ offspring with probability $p_i(x)$, for $i = 0, 1, \ldots, n$. Then the expected number of offspring from this lottery is

$$\psi(x) = \sum_{j=1}^{n} j\, p_j(x).$$

Let $L$ be a lottery on $X$ that delivers $x_i \in X$ with probability $q_i$ for $i = 1, \ldots, k$. The probability of $j$ offspring given $L$ is then $\sum_{i=1}^{k} q_i p_j(x_i)$, so the expected number of offspring given $L$ is

$$\sum_{j=1}^{n} j \sum_{i=1}^{k} q_i p_j(x_i) = \sum_{i=1}^{k} q_i \sum_{j=1}^{n} j\, p_j(x_i) = \sum_{i=1}^{k} q_i\, \psi(x_i),$$

which is the expected utility principle with utility function $\psi(\cdot)$. See also Cooper (1987).

1.7 The Allais and Ellsberg Paradoxes

Although most decision theorists consider the expected utility principle acceptably accurate as a basis for modeling behavior—it is, after all, quite central to all of game theory—there are certainly well-established situations in which individuals appear to violate it. Machina (1987) reviews this body of evidence and presents models to deal with it. We sketch here the most famous of these "anomalies," the so-called Allais paradox and Ellsberg paradox. They are, of course, not "paradoxes" at all but simply empirical regularities that do not fit the expected utility principle.

Maurice Allais (1953) offered the following scenario. There are two choice situations in a game with prizes $x = \$2{,}500{,}000$, $y = \$500{,}000$, and $z = \$0$. The first is a choice between lotteries $\sigma = y$ and $\sigma' = 0.1x + 0.89y + 0.01z$. The second is a choice between $\tau = 0.11y + 0.89z$ and $\tau' = 0.1x + 0.9z$. Most people, when faced with these two choice situations, choose $\sigma$ and $\tau'$. What would you choose?

This pair of choices is not consistent with the expected utility principle. To see this, let us write $u_h = u(2{,}500{,}000)$, $u_m = u(500{,}000)$, and $u_l = u(0)$. Then if the expected utility principle holds, $\sigma \succ \sigma'$ implies $u_m > 0.1 u_h + 0.89 u_m + 0.01 u_l$, so $0.11 u_m > 0.10 u_h + 0.01 u_l$, which implies (adding $0.89 u_l$ to both sides) $0.11 u_m + 0.89 u_l > 0.10 u_h + 0.9 u_l$, which says $\tau \succ \tau'$.

Why do people make this mistake? Perhaps because of regret, which does not mesh well with the expected utility principle (Loomes 1988, Sugden 1993). If you choose $\sigma'$ in the first case and you end up getting nothing, you will feel really foolish, whereas in the second case you are probably going to get nothing anyway (not your fault), so increasing the chance of getting nothing a tiny bit (0.01) gives you a good chance (0.10) of winning the really big prize. Or perhaps because of loss aversion (1.9): in the first case the anchor point (the most likely outcome) is $500,000, while in the second case the anchor is 0. Loss-averse individuals will then shun $\sigma'$, which gives a positive probability of loss, whereas in the second case neither lottery involves a loss, from the standpoint of the most likely outcome.

The Allais paradox is an excellent illustration of problems that can arise when a lottery is consciously chosen by an act of will, and one knows that one has made such a choice. The regret in the first case arises because, if one chose the risky lottery and the payoff was zero, one knows for certain that one made a poor choice, at least ex post. In the second case, if one receives payoff zero, the odds are that it had nothing to do with one's choice. Hence there is no regret in the second case. But in the real world, most of the lotteries we experience are chosen by default, not by acts of will. Thus, if the outcome of such a lottery is poor, we feel bad because of the poor outcome, but not because we made a poor choice.
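The algebra above shows that the modal Allais choices cannot be reconciled with any utility function. The sketch below makes the same point numerically; the particular utility values assigned are arbitrary illustrative numbers, and the conclusion holds for any increasing assignment.

```python
# For ANY utilities u_l < u_m < u_h, preferring sigma over sigma' while also
# preferring tau' over tau is inconsistent with expected utility, because the
# two expected-utility differences are identical.

def eu(lottery, u):
    """Expected utility of a lottery given as {prize: probability}."""
    return sum(p * u[prize] for prize, p in lottery.items())

u = {"high": 10.0, "mid": 7.0, "low": 0.0}   # any increasing assignment works

sigma       = {"mid": 1.0}
sigma_prime = {"high": 0.10, "mid": 0.89, "low": 0.01}
tau         = {"mid": 0.11, "low": 0.89}
tau_prime   = {"high": 0.10, "low": 0.90}

d1 = eu(sigma, u) - eu(sigma_prime, u)   # how much sigma beats sigma'
d2 = eu(tau, u) - eu(tau_prime, u)       # how much tau beats tau'
print(round(d1, 10) == round(d2, 10))    # True: the two rankings must agree
```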
Another classic violation of the expected utility principle was suggested by Daniel Ellsberg (1961). Consider two urns. Urn A has 51 red balls and 49 white balls. Urn B also has 100 red and white balls, but the fraction of red balls is unknown. One ball is chosen from each urn but remains hidden from sight. Subjects are asked to choose in two situations. First, a subject can choose the ball from urn A or urn B, and if the ball is red, the subject wins $10. In the second situation, the subject can choose the ball from urn A or urn B, and if the ball is white, the subject wins $10. Many subjects choose the ball from urn A in both cases. This violates the expected utility principle no matter what probability $p$ the subject places on the ball from urn B being white. For in the first situation, the payoff to choosing urn A is $0.51 u(10) + 0.49 u(0)$, and the payoff from choosing urn B is $(1-p)\, u(10) + p\, u(0)$, so strictly preferring urn A means $p > 0.49$. In the second situation, the payoff to choosing urn A is $0.49 u(10) + 0.51 u(0)$, and the payoff from choosing urn B is $p\, u(10) + (1-p)\, u(0)$, so strictly preferring urn A means $p < 0.49$. No single $p$ satisfies both conditions, which shows that the expected utility principle does not hold.

Whereas the other proposed anomalies of classical decision theory can be interpreted as failures of linearity in probabilities, regret, loss aversion, and epistemological ambiguities, the Ellsberg paradox strikes even more deeply, because it implies that humans systematically violate the following principle of first-order stochastic dominance (FOSD):

Let $p(x)$ and $q(x)$ be the probabilities of winning $x$ or more in lotteries A and B, respectively. If $p(x) \geq q(x)$ for all $x$, then $A \succeq B$.

The following form of the Ellsberg paradox (Segal 1987) shows that human subjects systematically violate FOSD. Consider an urn with 90 balls, of which 30 are red (R). The other 60 are either black (B) or yellow (Y), but in unknown numbers of each. One ball will be drawn at random from this urn. Consider the following four lotteries:

A1: Receive $100 if R, otherwise nothing.
A2: Receive $100 if B, otherwise nothing.
B1: Receive $100 if R or Y, otherwise nothing.
B2: Receive $100 if B or Y, otherwise nothing.

Ellsberg predicted that most subjects would prefer A1 to A2, but B2 to B1, because in both cases the preferred lottery has known probabilities. This prediction was verified by Slovic and Tversky (1974) and MacCrimmon and Larsson (1979). To see that this violates FOSD, note that FOSD implies that if $x$ is preferred to zero, and if one lottery pays $x$ if event $S$ occurs and nothing otherwise, while a second lottery pays $x$ if event $T$ occurs and nothing otherwise, then the first lottery is preferred to the second only if the probability of $S$ is greater than the probability of $T$. However, in this example $P(R) = 1/3$, so $P(B \cup Y) = 2/3$. Thus A1 $\succ$ A2 implies $1/3 = P(R) > P(B)$, and hence $P(R \cup Y) = 1 - P(B) > 2/3 = P(B \cup Y)$. FOSD then implies B1 $\succ$ B2, so preferring B2 to B1 violates FOSD.

The usual explanation of this behavior is that the subject knows the probabilities associated with the first urn, while the probabilities associated with the second urn are unknown, and hence there appears to be an added degree of risk associated with choosing from the second urn rather than the first. If decision makers are risk averse, and if they perceive that the second urn is considerably riskier than the first, they will prefer the first urn. Of course, although some relatively sophisticated probability theory assures us that there is in fact no such additional risk, it is hardly a failure of rationality for subjects to come to the opposite conclusion. The Ellsberg paradox is thus a case of performance error on the part of subjects, rather than a failure of rationality.
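A short sketch of the two-urn argument, assuming only that winning $10 is better than winning nothing; the utility numbers below are illustrative placeholders.

```python
# For every candidate subjective probability p that the urn-B ball is white,
# at most one of the two observed preferences for urn A can hold, so no single
# prior rationalizes choosing urn A in both situations.

u10, u0 = 1.0, 0.0   # illustrative utilities of winning $10 and of nothing

def prefers_A_when_red_wins(p):
    return 0.51 * u10 + 0.49 * u0 > (1 - p) * u10 + p * u0

def prefers_A_when_white_wins(p):
    return 0.49 * u10 + 0.51 * u0 > p * u10 + (1 - p) * u0

consistent_priors = [p / 100 for p in range(101)
                     if prefers_A_when_red_wins(p / 100)
                     and prefers_A_when_white_wins(p / 100)]
print(consistent_priors)   # [] -- empty: the pair of choices is inconsistent
```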
1.8 Risk and the Shape of the Utility Function

If $\succ$ is defined over $X$, we can say nothing about the shape of a utility function $u(\cdot)$ representing $\succ$, because by Theorem 1.2, any increasing function of $u(\cdot)$ also represents $\succ$. However, if $\succ$ is represented by a utility function $u(x)$ satisfying the expected utility principle, then $u(\cdot)$ is determined up to an arbitrary constant and unit of measure.⁷

THEOREM 1.4 Suppose the utility function $u(\cdot)$ represents the preference relation $\succ$ and satisfies the expected utility principle. If $v(\cdot)$ is another utility function representing $\succ$, then there are constants $a, b \in \mathbb{R}$ with $a > 0$ such that $v(x) = a u(x) + b$ for all $x \in X$.

For a proof of this theorem, see Mas-Colell, Whinston and Green (1995), p. 173.

⁷ Because of this theorem, the difference between two utilities means nothing. We thus say utilities over outcomes are ordinal, meaning we can say that one bundle is preferred to another, but we cannot say by how much. By contrast, the next theorem shows that utilities over lotteries are cardinal, in the sense that, up to an arbitrary constant and an arbitrary positive choice of units, utility is numerically uniquely defined.

Figure 1.2. A concave utility function.

If $X = \mathbb{R}$, so the payoffs can be considered to be "money," and utility satisfies the expected utility principle, what shape do such utility functions have? It would be nice if they were linear in money, in which case expected utility and expected value would be the same thing (why?). But generally utility will be strictly concave, as illustrated in Fig. 1.2. We say a function $u: X \to \mathbb{R}$ is strictly concave if, for any $x, y \in X$ and any $p \in (0,1)$, we have $p u(x) + (1-p) u(y) < u(p x + (1-p) y)$. We say $u(x)$ is weakly concave, or simply concave, if $u(x)$ is either strictly concave or linear, in which latter case the above inequality is replaced by the equality $p u(x) + (1-p) u(y) = u(p x + (1-p) y)$.

If we define the lottery $\pi$ as paying $x$ with probability $p$ and $y$ with probability $1-p$, then the condition for strict concavity says that the expected utility of the lottery is less than the utility of the expected value of the lottery, as depicted in Fig. 1.2. To see this, note that the expected value of the lottery is $E = p x + (1-p) y$, which divides the line segment between $x$ and $y$ into two segments, the segment $xE$ having length $(p x + (1-p) y) - x = (1-p)(y - x)$, and the segment $Ey$ having length $y - (p x + (1-p) y) = p(y - x)$. Thus $E$ divides $[x, y]$ into two segments whose lengths have ratio $(1-p)/p$. From elementary geometry, it follows that $B$ divides segment $[A, C]$ into two segments whose lengths have the same ratio. By the same reasoning, point $H$ divides segment $[F, G]$ into segments with the same ratio of lengths. This means the point $H$ has the coordinate value $p u(x) + (1-p) u(y)$, which is the expected utility of the lottery. But by definition, the utility of the expected value of the lottery is at $D$, which lies above $H$.
This proves that the utility of the expected value is greater than the expected utility of the lottery for a strictly concave utility function. This is known as Jensen's inequality.

What are good candidates for $u(x)$? It is easy to see that strict concavity means $u''(x) < 0$, provided $u(x)$ is twice differentiable (which we assume). But there are lots of functions with this property. According to the famous Weber–Fechner law of psychophysics, for a wide range of sensory stimuli, and over a wide range of levels of stimulation, a just-noticeable change in a stimulus is a constant fraction of the original stimulus. If this holds for money, then the utility function is logarithmic.

We say an individual is risk averse if the individual prefers the expected value of a lottery to the lottery itself (provided, of course, the lottery does not offer a single payoff with probability 1, which we call a "sure thing"). We know, then, that an individual with utility function $u(\cdot)$ is risk averse if and only if $u(\cdot)$ is strictly concave.⁸ Similarly, we say an individual is risk loving if he prefers any lottery to the expected value of the lottery, and risk neutral if he is indifferent between a lottery and its expected value. Clearly, an individual is risk neutral if and only if he has linear utility.

⁸ One may ask why people play government-sponsored lotteries, or spend money at gambling casinos, if they are generally risk averse. The most plausible explanation is that people enjoy the act of gambling. The same woman who will have insurance on her home and car, both of which presume risk aversion, will gamble small amounts of money for recreation. An excessive love of gambling, of course, leads an individual either to personal destruction or to wealth and fame (usually the former).

Does there exist a measure of risk aversion that allows us to say when one individual is more risk averse than another, or how an individual's risk aversion changes with changing wealth? We may define individual A to be more risk averse than individual B if whenever A prefers a lottery to an amount of money $x$, B will also prefer the lottery to $x$. We say A is strictly more risk averse than B if he is more risk averse, and there is some lottery that B prefers to an amount of money $x$ but such that A prefers $x$ to the lottery.

Clearly, the degree of risk aversion depends on the curvature of the utility function (by definition the curvature of $u(x)$ at $x$ is $u''(x)$), but because $u(x)$ and $v(x) = a u(x) + b$ ($a > 0$) describe the same behavior while $v(x)$ has curvature $a$ times that of $u(x)$, we need something more sophisticated. The obvious candidate is $\lambda_u(x) = -u''(x)/u'(x)$, which does not depend on scaling factors. This is called the Arrow–Pratt coefficient of absolute risk aversion, and it is exactly the measure that we need. We have the following theorem.

THEOREM 1.5 An individual with utility function $u(x)$ is more risk averse than an individual with utility function $v(x)$ if and only if $\lambda_u(x) > \lambda_v(x)$ for all $x$.

For example, the logarithmic utility function $u(x) = \ln(x)$ has Arrow–Pratt measure $\lambda_u(x) = 1/x$, which decreases with $x$; i.e., as the individual becomes wealthier, he becomes less risk averse. Studies show that this property, called decreasing absolute risk aversion, holds rather widely (Rosenzweig and Wolpin 1993, Saha, Shumway and Talpaz 1994, Nerlove and Soedjiana 1996). Another increasing concave function is $u(x) = x^a$ for $a \in (0,1)$, for which $\lambda_u(x) = (1-a)/x$, which also exhibits decreasing absolute risk aversion. Similarly, $u(x) = 1 - x^{-a}$ ($a > 0$) is increasing and concave, with $\lambda_u(x) = (a+1)/x$, which again exhibits decreasing absolute risk aversion. This utility has the additional attractive property that utility is bounded: no matter how rich you are, $u(x) < 1$.⁹ Yet another candidate for a utility function is $u(x) = 1 - e^{-ax}$, for some $a > 0$. In this case $\lambda_u(x) = a$, which we call constant absolute risk aversion.

⁹ If utility is unbounded, it is easy to show that there is a lottery that you would be willing to give all your wealth to play, no matter how rich you are. This is not plausible behavior.

Another commonly used measure is the coefficient of relative risk aversion, $\mu_u(x) = x\,\lambda_u(x)$. Note that for each of the utility functions $u(x) = \ln(x)$, $u(x) = x^a$ for $a \in (0,1)$, and $u(x) = 1 - x^{-a}$ ($a > 0$), $\mu_u(x)$ is constant, which we call constant relative risk aversion. For $u(x) = 1 - e^{-ax}$ ($a > 0$), we have $\mu_u(x) = ax$, so relative risk aversion increases with wealth.
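The following sketch computes both measures numerically for these four utility families. The parameter value $a = 0.5$ and the wealth levels are purely illustrative, and the derivatives are approximated by finite differences rather than computed analytically.

```python
import math

def risk_aversion(u, x, h=1e-4):
    """Return (absolute, relative) Arrow-Pratt risk aversion of u at x."""
    du  = (u(x + h) - u(x - h)) / (2 * h)            # u'(x)
    d2u = (u(x + h) - 2 * u(x) + u(x - h)) / h**2    # u''(x)
    lam = -d2u / du                                  # absolute risk aversion
    return lam, x * lam                              # relative = x * lambda

a = 0.5
families = {
    "log":              lambda x: math.log(x),
    "power x^a":        lambda x: x ** a,
    "bounded 1 - x^-a": lambda x: 1 - x ** (-a),
    "CARA 1 - e^-ax":   lambda x: 1 - math.exp(-a * x),
}
for name, u in families.items():
    values = [tuple(round(v, 3) for v in risk_aversion(u, x)) for x in (1.0, 2.0, 4.0)]
    print(name, values)
# Absolute risk aversion falls with wealth for the first three families (DARA)
# and is constant for CARA; relative risk aversion is constant for the first
# three and rises with wealth for CARA.
```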
1.9 Prospect Theory

A large body of experimental evidence indicates that people value payoffs according to whether they are gains or losses compared to their current status quo position. This is related to the notion that individuals "adjust" to an accustomed level of income, so that subjective well-being is associated more with changes in income than with the level of income. See, for instance, Helson (1964), Easterlin (1974, 1995), Lane (1991, 1993), and Oswald (1997). Indeed, people appear to be about twice as averse to taking losses as to enjoying an equal level of gains (Kahneman, Knetsch and Thaler 1990, Tversky and Kahneman 1981b). This means, for instance, that an individual may attach zero value to a lottery that offers an equal chance of winning $1000 and losing $500. It also implies that people are risk loving over losses while remaining risk averse over gains (1.8 explains the concept of risk aversion). For instance, many individuals will choose a 25% probability of losing $2000 rather than a 50% chance of losing $1000 (both have the same expected value, of course, but the former is riskier).

More formally, suppose an individual has utility function $v(x - r)$, where $r$ is the status quo (his current position) and $x$ represents a change from the status quo. Prospect theory, developed by Daniel Kahneman and Amos Tversky, asserts that there is a "kink" in $v(x - r)$ such that the slope of $v(\cdot)$ is two to three times as great just to the left of $x = r$ as to the right, that $v(\cdot)$ is concave over gains and convex over losses, and that the curvature goes to zero for large positive and negative values. In other words, individuals (a) are two to three times more sensitive to small losses than they are to small gains; (b) exhibit declining marginal utility over gains and declining absolute marginal utility over losses; and (c) are quite insensitive to changes in either large gains or large losses. This utility function is exhibited in figure 1.3.

Figure 1.3. Loss aversion according to prospect theory: the psychic payoff $v(x - r)$ plotted against money $x$, with a kink at the reference point $r$.
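A minimal sketch of a value function with these three properties appears below. The loss-aversion coefficient 2.25 and the curvature exponent 0.88 are the estimates commonly attributed to Tversky and Kahneman, used here purely as illustrative assumptions rather than as the author's own specification.

```python
def value(x, r=0.0, loss_aversion=2.25, alpha=0.88):
    """Psychic value of outcome x relative to reference point r."""
    gain = x - r
    if gain >= 0:
        return gain ** alpha                       # concave over gains
    return -loss_aversion * (-gain) ** alpha       # convex and steeper over losses

# The slope just below the kink is about 2.25 times the slope just above it:
eps = 1e-6
slope_gain = (value(eps) - value(0.0)) / eps
slope_loss = (value(0.0) - value(-eps)) / eps
print(round(slope_loss / slope_gain, 2))   # 2.25
```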
Experimental economists have long known that the degree of risk aversion exhibited in the laboratory over small gambles cannot be explained by standard expected utility theory, according to which risk aversion is measured by the curvature of the utility function (1.8). The problem is that for small gambles the utility function should be almost flat. The issue has recently been formalized by Rabin (2000). Consider a lottery that imposes a $100 loss and offers a $125 gain with equal probability $p = 1/2$. Most subjects in the laboratory reject this lottery. Rabin shows that if this is true for all expected lifetime wealth levels under $300,000, then in order to induce a subject to sustain a loss of $600 with probability 1/2, you would have to offer him a gain of at least $36,000,000,000 with probability 1/2. This is, of course, quite absurd.

There are many regularities in empirical data on human behavior that fit prospect theory very well (Kahneman and Tversky 2000). For instance, returns on stocks in the United States have exceeded the returns on bonds by about 8 percentage points, averaged over the past 100 years. Assuming investors are capable of correctly estimating the shape of the return schedule, if this were due to risk aversion alone, then the average individual would be indifferent between a sure $51,209 and a lottery that paid $50,000 with probability 1/2 and $100,000 with probability 1/2. It is, of course, quite implausible that more than a few individuals would be this risk averse. However, a loss-aversion coefficient (the ratio of the slope of the utility function over losses at the kink to the slope over gains) of 2.25 is sufficient to explain this phenomenon. This loss-aversion coefficient is very plausible from experiments.

In a similar vein, people tend to sell stocks when they are doing well, but hold onto stocks when they are doing poorly. A kindred phenomenon holds for housing sales: homeowners are extremely averse to selling at a loss, and will sustain operating, tax, and mortgage costs for long periods of time in the hope of obtaining a favorable selling price.

One of the earliest examples of loss aversion is the so-called ratchet effect discovered by James Duesenberry, who noticed that over the business cycle, when times are good, people spend all their additional income, but when times start to go bad, people incur debt rather than curb consumption. As a result, there is a tendency for the fraction of income saved to decline over time. For instance, in one study unionized teachers consumed more when next year's income was going to increase (through wage bargaining), but did not consume less when next year's income was going to decrease.

We can explain this behavior with a simple loss aversion model. A teacher's utility can be written $u(c_t - r_t) + s_t(1 + \rho)$, where $c_t$ is consumption in period $t$, $s_t$ is savings in period $t$, $\rho$ is the rate of interest on savings, and $r_t$ is the reference point (status quo point) in period $t$. This assumes that the marginal utility of savings is constant, which is a very good approximation. Now suppose the reference point changes as follows: $r_{t+1} = \alpha r_t + (1 - \alpha) c_t$, where $\alpha \in [0, 1]$ is an adjustment parameter ($\alpha = 1$ means no adjustment, and $\alpha = 0$ means complete adjustment to last year's consumption). Note that when consumption in one period rises, the reference point in the next period rises, and conversely.

Now, dropping the time subscripts and assuming the individual has income $M$, so $c + s = M$, the individual chooses $c$ to maximize

$$u(c - r) + (M - c)(1 + \rho).$$

This gives the first-order condition $u'(c - r) = 1 + \rho$. Because this must hold for all $r$, we can differentiate totally with respect to $r$, getting

$$u''(c - r)\,\frac{dc}{dr} = u''(c - r).$$

This shows that $dc/dr = 1 > 0$, so when the individual's reference point rises, his consumption rises by an equal amount.
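A quick numerical check of this comparative static, under assumed illustrative values (a strictly concave gain-loss utility $u(z) = -e^{-z}$, a 5% interest rate, and income 100), confirms that the chosen consumption level moves one-for-one with the reference point.

```python
import math

def optimal_c(r, M=100.0, rho=0.05):
    """Grid-search the consumption c maximizing u(c - r) + (M - c)(1 + rho)."""
    u = lambda z: -math.exp(-z)
    grid = [i / 1000 for i in range(int(M * 1000) + 1)]
    return max(grid, key=lambda c: u(c - r) + (M - c) * (1 + rho))

c0, c1 = optimal_c(r=10.0), optimal_c(r=11.0)
print(round(c1 - c0, 3))   # 1.0: consumption rises by the same amount as r
```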
One general implication of prospect theory is a status quo bias, according to which people often prefer the status quo over any of the alternatives, but if one of the alternatives becomes the status quo, that too is preferred to any of the alternatives (Kahneman, Knetsch and Thaler 1991). The status quo bias makes sense if we recognize that any change can involve a loss, and because on average gains do not offset losses, it is possible that any one of a number of alternatives might be preferred if it is the status quo. For instance, if employers make joining a 401(k) savings plan the default position, almost all employees join. If not joining is made the default position, most employees do not join. Similarly, if the state automobile insurance commission declares one type of policy the "default option" and insurance companies ask individual policyholders how they would like to vary from the default, the policyholders tend not to vary, no matter what the default is (Camerer 2000).

Another implication of prospect theory is the endowment effect (Kahneman et al. 1991), according to which people place a higher value on what they possess than they place on the same things when they do not possess them. For instance, if you win a bottle of wine that you could sell for $200, you may drink it rather than sell it, but you would never think of buying a $200 bottle of wine. A famous experimental result exhibiting the endowment effect is the "mug" experiment described by Kahneman, Knetsch and Thaler (1990). College student subjects who are given coffee mugs with the school logo on them demand a price two to three times as high to sell the mugs as those without mugs are willing to pay to buy them. There is evidence that people underestimate the endowment effect, and hence cannot appropriately correct for it in their choice behavior (Loewenstein and Adler 1995).

Yet another implication of prospect theory is the existence of a framing effect, whereby one form of a lottery is strictly preferred to another even though the two have the same payoffs with the same probabilities (Tversky and Kahneman 1981a). For instance, people prefer a price of $10 plus a $1 discount to a price of $8 plus a $1 surcharge. Framing is of course closely associated with the endowment effect, because framing usually involves privileging the initial state from which movements are assessed.

The framing effect can seriously distort effective decision making. In particular, when it is not clear what the appropriate reference point is, decision makers can exhibit serious inconsistency in their choices. Kahneman and Tversky give a dramatic example from health care policy. Suppose we face a flu epidemic in which we expect 600 people to die if nothing is done. If program A is adopted, 200 people will be saved, while if program B is adopted, there is a 1/3 probability that 600 will be saved and a 2/3 probability that no one will be saved. In one experiment, 72% of a sample of respondents preferred A to B. Now suppose instead that if program C is adopted, 400 people will die, while if D is adopted there is a 1/3 probability that nobody will die and a 2/3 probability that 600 people will die. Now 78% of respondents preferred D to C, even though A and C are equivalent in terms of the probability of each final state, and B and D are similarly equivalent. However, in the A versus B choice the alternatives are framed as gains, whereas in the C versus D choice the alternatives are framed as losses, and people are loss averse.
The inconsistency stems from the fact that there is no natural reference point for the decision maker, because the gains and losses are experienced by others, not by the decision maker himself.

The brilliant experiments by Kahneman, Tversky, and their coworkers clearly show that humans exhibit systematic biases in the way they make decisions. However, it should be clear that none of the above examples illustrates preference inconsistency once the appropriate parameter (current time, current position, status quo point) is admitted into the preference function. This point is formally demonstrated in Sugden (2003). Sugden considers a preference relation of the form $f \succeq g|h$, which means "lottery $f$ is weakly preferred to lottery $g$ when one's status quo position is lottery $h$." Sugden shows that if several conditions on this preference relation obtain, most of which are direct generalizations of the Savage conditions (1.5), then there is a utility function $u(x, z)$ such that $f \succeq g|h$ if and only if $E[u(f, h)] \geq E[u(g, h)]$, where the expectation is taken over the probability of events derived from the preference relation.

1.10 Heuristics and Biases in Decision-making

Laboratory testing of the standard economic model of choice under uncertainty was initiated by the psychologists Daniel Kahneman and Amos Tversky. In a famous article in the journal Science, Tversky and Kahneman (1974) summarized their early research as follows:

How do people assess the probability of an uncertain event or the value of an uncertain quantity? ... people rely on a limited number of heuristic principles which reduce the complex tasks of assessing probabilities and predicting values to simpler judgmental operations. In general, these heuristics are quite useful, but sometimes they lead to severe and systematic errors.

Subsequent research has strongly supported this assessment (Kahneman, Slovic and Tversky 1982, Shafir and Tversky 1992, Shafir and Tversky 1995). Although we still do not have adequate models of these heuristics, we can make certain generalizations.

First, in judging whether an event or object A belongs to a class or process B, one heuristic that people use is to consider whether A is representative of B but to consider no other relevant facts, such as the frequency of B. For instance, if informed that an individual has a good sense of humor and likes to entertain friends and family, and asked whether the individual is a professional comic or a clerical worker, people are more likely to say the former. This is despite the fact that a randomly chosen person is much more likely to be a clerical worker than a professional comic, and many people have a good sense of humor, so there are many more clerical workers satisfying the description than professional comics.

A particularly pointed example of this heuristic is the famous "Linda the bank teller" problem (Tversky and Kahneman 1983). Subjects are given the following description of a hypothetical person named Linda:

Linda is 31 years old, single, outspoken, and very bright. She majored in philosophy. As a student, she was deeply concerned with issues of discrimination and social justice, and also participated in anti-nuclear demonstrations.

The subjects were then asked to rank order eight statements about Linda according to their probabilities. The statements included the following two:

Linda is a bank teller.
Linda is a bank teller and is active in the feminist movement.
More than 80% of the subjects—graduate and medical school students with statistical training and doctoral students in the decision science program at Stanford University's business school—ranked the second statement as more probable than the first. This is clearly a simple logical error, because every feminist bank teller is also a bank teller. It appears, once again, that subjects measure probability by representativeness and ignore baseline frequencies.

I should note, however, that there is another interpretation according to which the subjects are correct in their judgment. Suppose you were given the above description of Linda and were told you would be given a list of addresses and phone numbers from which you were to locate Linda as quickly as possible. Would you rather be given a list of bank tellers or a list of feminist bank tellers? The answer is probably the second. In other words, the probability that a randomly chosen bank teller is Linda is probably much lower than the probability that a randomly chosen feminist bank teller is Linda.

A second heuristic is that in assessing the frequency of an event, people take excessive account of information that is easily available or highly salient, even though a selective bias is obviously involved. For this reason, people tend to overestimate the probability of rare events, because such events are highly newsworthy while non-occurrences are not reported. Thus, people worry much more about dying in an accident while flying than while driving, even though air travel is much safer than automobile travel.

A third heuristic in problem solving is to start from an initial guess, chosen for its representativeness or salience, and adjust upward or downward toward a final figure. This is called anchoring because there is a tendency to underadjust, so the result is too close to the initial guess. Probably as a result of anchoring, people tend to overestimate the probability of conjunctions (p and q) and underestimate the probability of disjunctions (p or q). For an instance of the former, a person who knows an event occurs with 95% probability may overestimate the probability that the event occurs ten times in a row, suggesting a probability of 90%; the actual probability is about 60%. In this case the individual starts with 95% and does not adjust downward sufficiently. Similarly, if a daily event has a failure rate of one in a thousand, people will underestimate the probability that a failure occurs at least once in a year, suggesting a figure of 5%; the actual probability is about 30.5%. Again, the individual starts with 0.1% and does not adjust upward enough. (Both figures are verified in the short check below.)

A fourth heuristic is that people prefer objective probability distributions to subjective distributions derived from applying probabilistic principles, such as the principle of insufficient reason, which says that if you are completely ignorant as to which of several outcomes will occur, you should treat them as equally probable. For example, if you give a subject a prize for drawing a red ball from an urn containing red and white balls, the subject will pay to have the urn contain 50% red balls rather than an indeterminate percentage of red balls. This is the famous Ellsberg paradox, analyzed above.
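As a quick sanity check of the two anchoring examples mentioned above (the arithmetic only restates the examples; no new data are introduced):

```python
# Conjunction: a 95%-likely event occurring ten times in a row.
p_ten_in_a_row = 0.95 ** 10
# Disjunction: a 1-in-1000 daily failure occurring at least once in 365 days.
p_failure_in_year = 1 - (1 - 1 / 1000) ** 365

print(round(p_ten_in_a_row, 3))     # about 0.599, i.e., roughly 60%
print(round(p_failure_in_year, 3))  # about 0.306, i.e., roughly 30.5%
```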
Choice theorists often express dismay over the failure of people to apply the laws of probability and to conform to normative decision theory. Yet people may be applying rules that serve them well in daily life. It takes many years of study to feel at home with the laws of probability, the understanding of which is the product of the last couple of hundred years of scientific research. Moreover, it is costly, in terms of time and effort, to apply these laws even if we know them. Of course, if the stakes are high enough, it is worthwhile to go to the effort, or to engage an expert who will do it for you. But generally, as Kahneman and Tversky suggest, we apply a set of heuristics that more or less get the job done. Among the most prominent heuristics is simply imitation: decide what class of phenomenon is involved, find out what people "normally do" in that situation, and do it. If there is some mechanism leading to the survival and growth of relatively successful behaviors, and if the problem in question recurs with sufficient regularity, the choice-theoretic solution will describe the winner of a dynamic social process of trial, error, and imitation.

Should we expect people to conform to the axioms of choice theory—transitivity, independence from irrelevant alternatives, the sure-thing principle, and the like? Where we know that individuals are really optimizing and have expertise in decision theory, we doubtless should. But this applies only to a highly restricted range of actions. In more general settings we should not. We might have recourse to Darwinian analysis, demonstrating that under the appropriate conditions individuals who are genetically constituted to obey the axioms of choice theory will be better fit to solve general decision-theoretic problems, and hence will emerge triumphant through an evolutionary dynamic. But human beings did not evolve facing general decision-theoretic problems. Rather, they faced a few specific decision-theoretic problems associated with survival in small social groups. We may have to settle for modeling these specific choice contexts to discover how our genetic constitution and cultural tools interact in determining choice under uncertainty.

2 Game Theory: Basic Concepts

High-rationality solution concepts in game theory can emerge in a world populated by low-rationality agents.
Young (1998)

The philosophers kick up the dust and then complain that they cannot see.
Bishop Berkeley

This chapter defines and illustrates the fundamental tools of game theory: extensive form games, normal form games, and Nash equilibria. The reader is asked to take the importance of the Nash equilibrium concept on faith, for the analysis of when individuals may be expected to play a Nash equilibrium involves epistemic game theory, developed in later chapters, and evolutionary dynamics (Gintis 2009). The chapter begins, however, with a carefully developed example that showcases a major contribution of game theory to understanding social interactions: insisting on absolute clarity concerning the rules of the game and its informational assumptions. In the process, we touch on such engaging issues as credible and incredible threats and the value of precommitment, thus setting the tone for the next several chapters.

2.1 Big John and Little John

Big John and Little John eat coconuts, which dangle from a lofty branch of a palm tree. Their favorite coconut palm produces only one fruit per tree. To get the coconut, at least one of them must climb the tree and knock the coconut loose so that it falls to the ground.
Careful energy measurements show that a coconut is worth 10 Kc (kilocalories) of energy; the cost of running up the tree, shaking the coconut loose, and running back down to the ground is 2 Kc for Big John, but is negligible for Little John, who is much smaller. Moreover, if both individuals climb the tree, shake the coconut loose, then climb down the tree and eat the coconut, Big John gets 7 Kc and Little John gets only 3 Kc, because Big John hogs most of it; if only Big John climbs the tree, while Little John waits on the ground for the coconut to fall, Big John gets 6 Kc and Little John gets 4 Kc (Little John eats some before Big John gets back down from the tree); and if only Little John climbs the tree, Big John gets 9 Kc and Little John gets 1 Kc (most of the food is gone by the time Little John gets back down). What will Big John and Little John do if each wants to maximize net energy gain?

There is one crucial issue that must be resolved: who decides first what to do, Big John or Little John? There are three possibilities: (a) Big John decides first; (b) Little John decides first; (c) both individuals decide simultaneously. We will go through the three cases in turn.

Assuming Big John decides first, we get the situation depicted in Fig. 2.1. We call a figure like this a game tree, and we call the game it defines an extensive form game. At the top of the game tree is the root node (the little dot labeled "Big John") with two branches, labeled w (wait) and c (climb). This means Big John gets to choose and can go either left (w) or right (c). This brings us to the two nodes labeled "Little John," at each of which Little John can wait (w) or climb (c).

Figure 2.1. Big John and Little John: Big John chooses first. (Terminal payoffs, Big John's first: w,w → 0,0; w,c → 9,1; c,w → 4,4; c,c → 5,3.)

While Big John has only two strategies, Little John actually has four:

a. Climb no matter what Big John does (cc).
b. Wait no matter what Big John does (ww).
c. Do the same thing Big John does (wc).
d. Do the opposite of what Big John does (cw).

Here the first letter in parentheses indicates Little John's move if Big John waits, and the second is Little John's move if Big John climbs. We call a move taken by a player at a node an action, and we call a series of actions that fully define the behavior of a player a strategy—actually a pure strategy, in contrast to "mixed" and "behavioral" strategies, which we will discuss later, that involve randomizing. Thus, Big John has two strategies, each of which is simply an action, while Little John has four strategies, each of which consists of two actions—one to be used at his left-hand node (reached when Big John waits), and one at his right-hand node (reached when Big John climbs).

At the bottom of the game tree are four nodes, which we variously call leaf or terminal nodes. At each terminal node is the payoff to the two players, Big John (player 1) first and Little John (player 2) second, if they choose the strategies that take them to that particular leaf. You should check that the payoffs correspond to our description above. For instance, at the leftmost leaf, when both wait, with neither John expending or ingesting energy, the payoff is (0,0). At the rightmost leaf, both climb the tree, costing Big John 2 Kc, after which Big John gets 7 Kc and Little John gets 3 Kc. Their net payoffs are thus (5,3). And similarly for the other two leaves.

How should Big John decide what to do? Clearly, Big John should figure out how Little John will react to each of Big John's two choices, w and c.
If Big John chooses w, then Little John will choose c, because this pays 1 Kc as opposed to 0 Kc. Thus, Big John gets 9 Kc by moving left (w). If Big John chooses c, Little John will choose w, because this pays 4 Kc as opposed to 3 Kc for choosing c. Thus Big John gets 4 Kc for choosing c, as opposed to 9 Kc for choosing w. We have now answered Big John's problem: choose w.

What about Little John? Clearly, Little John must choose c at the left node, but what should he choose at the right node? Of course, it does not really matter, because Little John will never be at the right node. However, we must specify not only what a player does "along the path of play" (in this case, the left branch of the tree), but at all possible nodes on the game tree. This is because we can only say for sure that Big John is choosing a best response to Little John if we know what Little John does, and conversely. If Little John makes a wrong choice at the right node, in some games (though not this one) Big John would do better by playing c. In short, Little John must choose one of the four strategies listed above. Clearly, Little John should choose cw (do the opposite of Big John), because this maximizes Little John's payoff no matter what Big John does.

Conclusion: the only reasonable solution to this game is for Big John to wait on the ground and for Little John to do the opposite of what Big John does. Their payoffs are (9,1). We call this a Nash equilibrium (named after John Nash, who invented the concept in about 1950). A Nash equilibrium in a two-player game is a pair of strategies, each of which is a best response to the other; i.e., each gives the player using it the highest possible payoff, given the other player's strategy.

There is another way to depict this game, called its strategic form or normal form. It is common to use both representations and to switch back and forth between them, according to convenience. The normal form corresponding to Fig. 2.1 is in Fig. 2.2. In this example we array the strategies of player 1 (Big John) in rows and the strategies of player 2 (Little John) in columns. Each entry in the resulting matrix represents the payoffs to the two players if they choose the corresponding strategies.

Figure 2.2. Normal form of Big John and Little John when Big John moves first (rows: Big John; columns: Little John):

            cc     cw     wc     ww
      w    9,1    9,1    0,0    0,0
      c    5,3    4,4    5,3    4,4

We find a Nash equilibrium from the normal form of the game by trying to pick out a row and a column such that the payoff at their intersection is the highest possible for player 1 down the column and the highest possible for player 2 across the row (there may be more than one such pair). Note that (w, cw) is indeed a Nash equilibrium of the normal form game, because 9 is better than 4 for Big John down the cw column, and 1 is the best Little John can do across the w row.

Can we find any other Nash equilibria of this game? Clearly (w, cc) is also a Nash equilibrium, because w is a best reply to cc and conversely. But the (w, cc) equilibrium has the drawback that if Big John should happen to make a mistake and play c, Little John gets only 3, whereas with cw, Little John gets 4. We say cc is weakly dominated by cw.

But what if Little John plays ww? Then Big John should play c, and it is clear that ww is a best response to c. So this gives us another Nash equilibrium, (c, ww), in which Little John does much better, getting 4 instead of 1, and Big John does much worse, getting 4 instead of 9.
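As a purely illustrative check, a brute-force search over the matrix of Fig. 2.2 confirms that (w, cc), (w, cw), and (c, ww) are exactly its pure-strategy Nash equilibria.

```python
# Payoffs are (Big John, Little John), indexed by (Big John's strategy, Little John's).
payoffs = {
    ("w", "cc"): (9, 1), ("w", "cw"): (9, 1), ("w", "wc"): (0, 0), ("w", "ww"): (0, 0),
    ("c", "cc"): (5, 3), ("c", "cw"): (4, 4), ("c", "wc"): (5, 3), ("c", "ww"): (4, 4),
}
big_strats = ["w", "c"]
little_strats = ["cc", "cw", "wc", "ww"]

def is_nash(b, l):
    """Neither player can gain by unilaterally switching strategies."""
    best_big = all(payoffs[(b, l)][0] >= payoffs[(b2, l)][0] for b2 in big_strats)
    best_little = all(payoffs[(b, l)][1] >= payoffs[(b, l2)][1] for l2 in little_strats)
    return best_big and best_little

print([(b, l) for b in big_strats for l in little_strats if is_nash(b, l)])
# [('w', 'cc'), ('w', 'cw'), ('c', 'ww')]
```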
Why did we not see this Nash equilibrium in our analysis of the extensive form game? The reason is that (c, ww) involves Little John making an incredible threat. "I don't care what you do," says Little John, "I'm waiting here on the ground, no matter what." The threat is "incredible" because Big John knows that if he plays w, then when it is Little John's turn to carry out the threat to play w, Little John will not in fact do so, simply because 1 is better than 0.¹ We say a Nash equilibrium of an extensive form game is subgame perfect if, at any point in the game tree, the play dictated by the Nash equilibrium remains a Nash equilibrium of the subgame. The strategy profile (c, ww) is not subgame perfect because, in the subgame beginning at Little John's left-hand node in Fig. 2.1, choosing w is not a best response. Nice try, anyway, Little John!

¹ This argument fails if the individuals can condition their behavior on one day on their behavior in previous days (see chapter 10). We assume the players cannot do this.

But what if Little John gets to choose first? Perhaps now Little John can force a better split than getting 1 compared to Big John's 9. This is the extensive form game of Fig. 2.3. We now call Little John player 1 and Big John player 2. Now Big John has four strategies (the strategies that belonged to Little John in the previous version of the game) and Little John has only two (the ones that belonged to Big John before). Little John notices that Big John's best response to w is c, and Big John's best response to c is w. Because Little John gets 4 in the first case and only 1 in the second, Little John chooses w. Big John's best choice is then cw, and the payoffs are (4,4). Note that by going first, Little John is able to precommit to a strategy that is an incredible threat when going second.

Figure 2.3. Big John and Little John: Little John chooses first. (Terminal payoffs, Little John's first: w,w → 0,0; w,c → 4,4; c,w → 1,9; c,c → 3,5.)

The normal form for the case when Little John goes first is illustrated in Fig. 2.4. Again we find the two Nash equilibria (w, cc) and (w, cw), and again we find another Nash equilibrium not evident at first glance from the game tree: now it is Big John who has an incredible threat, by playing ww, to which Little John's best response is c.

Figure 2.4. Normal form of the Big John and Little John game when Little John moves first (rows: Little John; columns: Big John):

            cc     cw     wc     ww
      w    4,4    4,4    0,0    0,0
      c    3,5    1,9    3,5    1,9

The final possibility is that the players choose simultaneously or, equivalently, each player chooses an action without seeing what the other player chooses. In this case, each player has two options: climb the tree (c) or wait on the ground (w). We then get the situation in Fig. 2.5. Note the new element in the game tree: the dotted line connecting the two places where Little John chooses. This is called an information set. Roughly speaking, an information set is a set of nodes at which (a) the same player chooses, and (b) the player choosing does not know which particular node represents the actual choice node. Note also that we could just as well interchange Big John and Little John in the diagram, reversing their payoffs at the terminal nodes, of course. This illustrates an important point: there may be more than one extensive form game representing the same real strategic situation.

Figure 2.5. Big John and Little John choose simultaneously. (Terminal payoffs as in Fig. 2.1.)

Even though there are fewer strategies in this game, it is hard to see what an equilibrium might be by looking at the game tree. This is because what Little John does cannot depend on which choice Big John makes, because Little John does not see Big John's choice. So let us look at the normal form game, in Fig. 2.6. From this figure it is easy to see that both (w, c) and (c, w) are Nash equilibria, the first obviously favoring Big John and the second favoring Little John. In fact, there is a third Nash equilibrium that is more difficult to pick out. In this equilibrium Big John randomizes by choosing c and w each with probability 1/2, and Little John does the same. This is called a mixed strategy Nash equilibrium; you will learn how to find and analyze it in 2.7. In this equilibrium Big John has payoff 4.5 and Little John has payoff 2. The reason for this meager total payoff is that with probability 1/4 both wait and get zero reward, and sometimes both climb the tree!

Figure 2.6. Big John and Little John: normal form in the simultaneous-move case (rows: Big John; columns: Little John):

            c      w
      c    5,3    4,4
      w    9,1    0,0
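Anticipating the method of 2.7, the short sketch below verifies this mixed equilibrium directly from the indifference conditions; it uses nothing beyond the payoffs of Fig. 2.6.

```python
# Each player mixes so as to make the other indifferent between c and w.
# Payoffs are (Big John, Little John), indexed by (Big John's move, Little John's move).
P = {("c", "c"): (5, 3), ("c", "w"): (4, 4), ("w", "c"): (9, 1), ("w", "w"): (0, 0)}

q = 4 / 8   # Little John's Pr(climb): solves 5q + 4(1-q) = 9q  (Big John indifferent)
p = 1 / 2   # Big John's Pr(climb):    solves 3p + (1-p) = 4p   (Little John indifferent)

def expected(i, p, q):
    """Expected payoff to player i (0 = Big John, 1 = Little John)."""
    prob = {("c", "c"): p * q, ("c", "w"): p * (1 - q),
            ("w", "c"): (1 - p) * q, ("w", "w"): (1 - p) * (1 - q)}
    return sum(prob[a] * P[a][i] for a in P)

print(p, q, expected(0, p, q), expected(1, p, q))   # 0.5 0.5 4.5 2.0
```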
2.2 The Extensive Form

An extensive form game $G$ consists of a number of players, a game tree, and a set of payoffs. A game tree consists of a number of nodes connected by branches. Each branch connects a head node to a distinct tail node. If $b$ is a branch of the game tree, we denote the head node of $b$ by $b^h$, and the tail node of $b$ by $b^t$.

A path from node $a$ to node $a'$ in the game tree is a sequence of $k$ branches starting at $a$ and ending at $a'$;² we call $k$ the length of the path. If there is a path from node $a$ to $a'$, we say $a$ is an ancestor of $a'$, and $a'$ is a successor to $a$. If a path from $a$ to $a'$ has length one, we call $a$ the parent of $a'$, and $a'$ is a child of $a$.

² Technically, a path is a sequence $b_1, \ldots, b_k$ of branches such that $b_1^h = a$, $b_i^t = b_{i+1}^h$ for $i = 1, \ldots, k-1$, and $b_k^t = a'$; i.e., the path starts at $a$, the tail of each branch is the head of the next branch, and the path ends at $a'$.

We require that the game tree have a unique node $r$, called the root node, that has no parent, and a set $T$ of nodes called terminal nodes or leaf nodes, that have no children. We associate with each terminal node $t \in T$ ($\in$ means "is an element of"), and each player $i$, a payoff $\pi_i(t) \in \mathbb{R}$ ($\mathbb{R}$ is the set of real numbers). We say the game is finite if it has a finite number of nodes. We assume all games are finite, unless otherwise stated.

We also require that the graph of $G$ have the following tree property. There must be exactly one path from the root node to any given terminal node in the game tree. Equivalently, every node except the root node has exactly one parent.

Players relate to the game tree as follows. Each nonterminal node is assigned to a player who moves at that node. Each branch $b$ with head node $b^h$ represents a particular action that the player assigned to $b^h$ can take there, and hence determines either a terminal node or the next point of play in the game—the particular child node $b^t$ to be visited next.³

³ Thus, if $p = (b_1, \ldots, b_k)$ is a path from $a$ to $a'$, then starting from $a$, if the actions associated with the $b_j$ are taken by the various players, the game moves to $a'$.

If a stochastic event occurs at a node $a$ (for instance, the weather is Good or Bad, or your partner is Nice or Nasty), we assign the fictitious player "Nature" to that node, the actions Nature takes representing the possible outcomes of the stochastic event, and we attach a probability to each branch of which $a$ is the head node, representing the probability that Nature chooses that branch (we assume all such probabilities are strictly positive).
The tree property thus means that there is a unique sequence of moves by the players (including Nature) leading from the root node to any specific node of the game tree, and for any two nodes, there is at most one sequence of player moves leading from the first to the second.

A player may know the exact node in the game tree when it is his turn to move (e.g., the first two cases in Big John and Little John, above), but he may know only that he is at one of several possible nodes. This is the situation Little John faces in the simultaneous choice case (Fig. 2.5). We call such a collection of nodes an information set. For a set of nodes to form an information set, the same player must be assigned to move at each of the nodes in the set and have the same array of possible actions at each node. We also require that if two nodes $a$ and $a'$ are in the same information set for a player, the moves that player made up to $a$ and $a'$ must be the same. This criterion is called perfect recall, because if a player never forgets his moves, he cannot make two different choices that subsequently land him in the same information set.⁴

⁴ Another way to describe perfect recall is to note that the information sets $\nu \in N_i$ for player $i$ are the nodes of a graph in which the children of an information set $\nu \in N_i$ are the $\nu' \in N_i$ that can be reached by one move of player $i$, plus some combination of moves of the other players and Nature. Perfect recall means that this graph has the tree property.

Suppose each player $i = 1, \ldots, n$ chooses strategy $s_i$. We call $s = (s_1, \ldots, s_n)$ a strategy profile for the game, and we define the payoff to player $i$, given strategy profile $s$, as follows. If there are no moves by Nature, then $s$ determines a unique path through the game tree, and hence a unique terminal node $t \in T$. The payoff $\pi_i(s)$ to player $i$ under strategy profile $s$ is then defined to be simply $\pi_i(t)$.

Suppose there are moves by Nature, by which we mean that at one or more nodes in the game tree there is a lottery over the various branches emanating from that node, rather than a player choosing at that node. For every terminal node $t \in T$, there is a unique path $p_t$ in the game tree from the root node to $t$. We say $p_t$ is compatible with strategy profile $s$ if, for every branch $b$ on $p_t$, if player $i$ moves at $b^h$ (the head node of $b$), then $s_i$ chooses action $b$ at $b^h$. If $p_t$ is not compatible with $s$, we write $p(s, t) = 0$. If $p_t$ is compatible with $s$, we define $p(s, t)$ to be the product of all the probabilities associated with the nodes of $p_t$ at which Nature moves along $p_t$, or 1 if Nature makes no moves along $p_t$. We now define the payoff to player $i$ as

$$\pi_i(s) = \sum_{t \in T} p(s, t)\, \pi_i(t). \qquad (2.1)$$

Note that this is the expected payoff to player $i$ given strategy profile $s$, assuming that Nature's choices are independent, so that $p(s, t)$ is just the probability that path $p_t$ is followed, given strategy profile $s$. We generally assume in game theory that players attempt to maximize their expected payoffs, as defined in (2.1).

Figure 2.7. Evaluating payoffs when Nature moves. (Nature goes left with probability 0.6, leading to a subgame with payoffs (L,l) = 4,4; (L,r) = 0,5; (R,l) = 5,0; (R,r) = 1,1, and right with probability 0.4, leading to a subgame with payoffs (L,l) = 6,4; (L,r) = 0,0; (R,l) = 0,0; (R,r) = 4,6.)

For example, consider the game depicted in Figure 2.7.
Here, Nature moves first, and with probability $p_l = 0.6$ goes left, where the game between players 1 and 2 is known as the Prisoner's Dilemma (2.11), and with probability $p_r = 0.4$ goes right, where the game between players 1 and 2 is known as the Battle of the Sexes (2.9). Note that player 1 knows Nature's move, because he has separate information sets on the two branches where Nature moves, but player 2 does not, because when he moves, he does not know whether he is on the left-hand or right-hand branch. If we write $\pi_i(x, y, z)$ for the payoff to player $i$ at the terminal node attained when player 1 plays $x \in \{L, R\}$, player 2 plays $y \in \{l, r\}$, and Nature plays $z \in \{l, r\}$ (i.e., moves left or right), then (2.1) gives, for instance,

$$\pi_1(L, l) = p_l\, \pi_1(L, l, l) + p_r\, \pi_1(L, l, r) = 0.6(4) + 0.4(6) = 4.8.$$

The reader should fill in the payoffs in the remaining seven cases.
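A short sketch computing all of these expected payoffs for Figure 2.7 follows; like the text's own example, it treats player 1's move as the same on both of Nature's branches.

```python
p_nature = {"l": 0.6, "r": 0.4}

# payoffs[(x, y, z)] = (payoff to player 1, payoff to player 2),
# where x is player 1's move, y is player 2's move, and z is Nature's move.
payoffs = {
    ("L", "l", "l"): (4, 4), ("L", "r", "l"): (0, 5),
    ("R", "l", "l"): (5, 0), ("R", "r", "l"): (1, 1),
    ("L", "l", "r"): (6, 4), ("L", "r", "r"): (0, 0),
    ("R", "l", "r"): (0, 0), ("R", "r", "r"): (4, 6),
}

def expected_payoff(i, x, y):
    """Equation (2.1): average player i's terminal payoff over Nature's move."""
    return sum(p_nature[z] * payoffs[(x, y, z)][i] for z in p_nature)

for x in ("L", "R"):
    for y in ("l", "r"):
        print(x, y, expected_payoff(0, x, y), expected_payoff(1, x, y))
# The first line reproduces the example in the text: pi_1(L, l) = 4.8.
```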
2.3 The Normal Form

The strategic form or normal form game consists of a number of players, a set of strategies for each of the players, and a payoff function that associates a payoff to each player with a choice of strategies by each player. More formally, an n-player normal form game consists of:

a. A set of players $i = 1, \ldots, n$.
b. A set $S_i$ of strategies for player $i = 1, \ldots, n$. We call $s = (s_1, \ldots, s_n)$, where $s_i \in S_i$ for $i = 1, \ldots, n$, a strategy profile for the game.⁵
c. A function $\pi_i: S \to \mathbb{R}$ for each player $i = 1, \ldots, n$, where $S$ is the set of strategy profiles, so $\pi_i(s)$ is player $i$'s payoff when strategy profile $s$ is chosen.

⁵ Technically, these are pure strategies, because later we will consider mixed strategies that are probabilistic combinations of pure strategies.

Two extensive form games are said to be equivalent if they correspond to the same normal form game, except perhaps for the labeling of the actions and the naming of the players. But given an extensive form game, how exactly do we form the corresponding normal form game? First, the players in the normal form are the same as the players in the extensive form. Second, for each player $i$, let $S_i$ be the set of strategies of that player, each strategy consisting of a choice of an action at each information set where $i$ moves. Finally, the payoff functions are given by equation (2.1). If there are only two players and a finite number of strategies, we can write the payoff function in the form of a matrix, as in Fig. 2.2.

As an exercise, you should work out the normal form matrix for the game depicted in Figure 2.7.

2.4 Mixed Strategies

Suppose a player has pure strategies $s_1, \ldots, s_k$ in a normal form game. A mixed strategy for the player is a probability distribution over $s_1, \ldots, s_k$; i.e., a mixed strategy has the form

$$\sigma = p_1 s_1 + \cdots + p_k s_k,$$

where $p_1, \ldots, p_k$ are all nonnegative and $\sum_{j=1}^{k} p_j = 1$. By this we mean that the player chooses $s_j$ with probability $p_j$, for $j = 1, \ldots, k$. We call $p_j$ the weight of $s_j$ in $\sigma$. If all the $p_j$'s are zero except one, say $p_l = 1$, we say $\sigma$ is a pure strategy, and we write $\sigma = s_l$. We say that pure strategy $s_j$ is used in mixed strategy $\sigma$ if $p_j > 0$. We say a strategy is strictly mixed if it is not pure, and we say that it is completely mixed if all pure strategies are used in it. We call the set of pure strategies used in a mixed strategy $\sigma_i$ the support of $\sigma_i$.

In an n-player normal form game where player $i$ has pure strategy set $S_i$ for $i = 1, \ldots, n$, a mixed strategy profile $\sigma = (\sigma_1, \ldots, \sigma_n)$ is the choice of a mixed strategy $\sigma_i$ by each player. We define the payoffs to $\sigma$ as follows. Let $\pi_i(s_1, \ldots, s_n)$ be the payoff to player $i$ when players use the pure strategy profile $(s_1, \ldots, s_n)$, and if $s$ is a pure strategy for player $i$, let $p_s$ be the weight of $s$ in $\sigma_i$. Then we define

$$\pi_i(\sigma) = \sum_{s_1 \in S_1} \cdots \sum_{s_n \in S_n} p_{s_1} p_{s_2} \cdots p_{s_n}\, \pi_i(s_1, \ldots, s_n).$$

This is a formidable expression, but the idea behind it is simple. We assume the players' choices are made independently, so the probability that the particular pure strategies $s_1 \in S_1, \ldots, s_n \in S_n$ will be used is simply the product $p_{s_1} \cdots p_{s_n}$ of their weights, and the payoff to player $i$ in this case is just $\pi_i(s_1, \ldots, s_n)$. We get the expected payoff by multiplying and adding up over all n-tuples of pure strategies.

2.5 Nash Equilibrium

The concept of a Nash equilibrium of a game is formulated most easily in terms of the normal form. Suppose the game has n players, with strategy sets $S_i$ and payoff functions $\pi_i: S \to \mathbb{R}$, for $i = 1, \ldots, n$, where $S$ is the set of strategy profiles. We use the following very useful notation. Let $\Delta S_i$ be the set of mixed strategies for player $i$, and let $\Delta S = \Delta S_1 \times \cdots \times \Delta S_n$ be the set of mixed strategy profiles for the game. If $\sigma \in \Delta S$, we write $\sigma_i$ for the $i$th component of $\sigma$ (i.e., $\sigma_i$ is player $i$'s mixed strategy in $\sigma$). If $\sigma \in \Delta S$ and $\tau_i \in \Delta S_i$, we write $(\sigma_{-i}, \tau_i) = (\tau_i, \sigma_{-i})$ for the profile obtained from $\sigma$ by replacing $\sigma_i$ with $\tau_i$; that is, it equals $(\tau_1, \sigma_2, \ldots, \sigma_n)$ if $i = 1$, and $(\sigma_1, \ldots, \sigma_{i-1}, \tau_i, \sigma_{i+1}, \ldots, \sigma_n)$ if $1$