Calibrated Learning and Correlated Equilibrium

University of Pennsylvania ScholarlyCommons
Statistics Papers, Wharton Faculty Research
10-1997

Dean P. Foster, University of Pennsylvania
Rakesh V. Vohra

Recommended Citation
Foster, D. P., & Vohra, R. V. (1997). Calibrated Learning and Correlated Equilibrium. Games and Economic Behavior, 21 (1-2), 40-55. http://dx.doi.org/10.1006/game.1997.0595

This paper is posted at ScholarlyCommons. https://repository.upenn.edu/statistics_papers/589

Abstract
Suppose two players repeatedly meet each other to play a game where:
1. each uses a learning rule with the property that it is a calibrated forecast of the other's plays, and
2. each plays a myopic best response to this forecast distribution.
Then, the limit points of the sequence of plays are correlated equilibria. In fact, for each correlated equilibrium there is some calibrated learning rule that the players can use which results in their playing this correlated equilibrium in the limit. Thus, the statistical concept of calibration is strongly related to the game theoretic concept of correlated equilibrium.
Disciplines: Behavioral Economics | Statistics and Probability

Dean P. Foster, University of Pennsylvania, Philadelphia, PA
Rakesh V. Vohra, Ohio State University, Columbus, OH

First draft: May. Revised: June. This version: October.

Abstract
Suppose two players meet each other in a repeated game where:
1. each uses a learning rule with the property that it is a calibrated forecast of the other's plays, and
2. each plays a best response to this forecast distribution.
Then the limit points of the sequence of plays are correlated equilibria. In fact, for each correlated equilibrium there is some calibrated learning rule that the players can use which results in their playing this correlated equilibrium in the limit. Thus the statistical concept of calibration is strongly related to the game theoretic concept of correlated equilibrium.

Introduction

The concept of a Nash equilibrium (NE) is so important to game theory that an extensive literature devoted to its defense and advancement exists. Even so, there are aspects of the Nash equilibrium concept that are puzzling. One is why any player should assume that the other will play their Nash equilibrium strategy. Aumann says: "This is particularly perplexing when, as often happens, there are multiple equilibria; but it has considerable force even when the equilibrium is unique." One resolution is to argue that the assumptions about an opponent's plays are the outcome of some learning process (see, for example, Kreps (a)). Learning is modeled as recurrent updating. Players choose a best reply on the basis of their forecasts of their opponents' future choices. Forecasts are described as a function of previous plays in the repeated game. Much attention has focused on developing forecast rules by which a Nash equilibrium or its refinements
may be learned. Many rules have been proposed, and convergence to Nash equilibrium has been established under certain conditions (see Skyrms). For example, Fudenberg and Kreps introduce the class of rules satisfying a property called "asymptotic myopic Bayes". They prove that if convergence takes place, it does so to a NE. Notice that convergence is not guaranteed. In summarizing other approaches, Kreps (b) points out that "in general convergence is not assured". This lack of convergence serves to lessen the importance of NE and its refinements.

On the positive side, Milgrom and Roberts have shown that under any learning rule that requires the player to make approximately best responses consistent with their expectations, play tends towards the serially undominated set of strategies. They call such learning rules "adaptive" and prove that if the sequence of plays converges to a NE or correlated equilibrium, then each player's play is consistent with adaptive learning.

Learning, as we have described it, takes place at the level of the individual. An important class of learning models involves learning at the level of populations (evolutionary models). Here the different strategies are represented by individuals in the population. In particular, a mixed strategy would be represented by assigning an appropriate fraction of the population to each strategy. A pair of individuals is selected at random to play the game. Individuals do not update their strategies, but their numbers wax and wane according to their average (suitably defined) payoff. Even in this environment, convergence to a NE is not guaranteed. On the positive side, results analogous to those of Milgrom and Roberts have been obtained by Samuelson and Zheng.

A second objection to NE is that it is inconsistent with the Bayesian perspective. A Bayesian player starts with a prior over what their opponent will select and chooses a best response to that. To argue that Bayesians should play the NE of the game is to insist that they each choose
a particular prior. Aumann has gone further and argued that the solution concept consistent with the Bayesian perspective is not NE but correlated equilibrium (CE). Support for such a view can be found in Nau and McCardle, who characterize CE in terms of the no-arbitrage condition so beloved by Bayesians. Also, Kalai and Lehrer show that Bayesian players with uncontradicted beliefs learn a correlated equilibrium.

In this note we provide a direct link from the Bayesian beliefs of players to the conclusion that they will play a CE. We do this by showing that a CE can be learned. We do not specify a particular learning rule; rather, we restrict our attention to learning rules that possess an asymptotic property called calibration. The key result is that if players use any forecasting rule with the property of being calibrated, then, in repeated plays of the game, the limit points of the sequence of plays are correlated equilibria.

The game theoretic importance of calibration follows from a theorem of Dawid. Given the Bayesian's prior, look at the forecasts generated by the posterior. The sequences of future events on which this forecast will not be calibrated have measure zero. That is, the Bayesian's prior assigns probability zero to such outcomes. Thus, under the common prior assumption, a Bayesian would expect all the other players to be using their posterior and hence to be calibrated. Now, combining our result that calibration implies correlated equilibria with the common prior assumption shows that Bayesians expect that in the limit they will be playing a correlated equilibrium. This provides an alternative proof to Aumann's proof that the common prior assumption and rationality imply a correlated equilibrium. If the common prior assumption holds, then it is common knowledge that all players are calibrated. If the players use a Bayesian forecasting scheme that is calibrated, then by the above, in repeated plays of the game, the limit points of the sequence of plays
are correlated equilibria.

In the next section of this paper we introduce notation and provide a rigorous definition of some of the terms used in the introduction. Subsequently we state and prove the main result of our paper. For ease of exposition we consider only the 2-person case. However, our results generalize easily to the n-person case.

Footnote 1: See discussion after Theorem.

Notation and Definitions

For $i = 1, 2$, denote by $S(i)$ the finite set of pure strategies of player $i$ and by $u_i(x, y)$ the payoff to player $i$, where $x \in S(1)$ and $y \in S(2)$. Let $m = |S(1)|$ and $n = |S(2)|$. A correlated strategy is a function $h$ from a finite probability space $\Gamma$ into $S(1) \times S(2)$, i.e., $h = (h_1, h_2)$ is a random variable whose values are pairs of strategies, one from $S(1)$ and the other from $S(2)$. Note that if $h$ is a correlated strategy, then $u_i(h_1, h_2)$ is a real valued random variable.

So as to understand the definition of a correlated equilibrium, imagine an umpire who announces to both players what $\Gamma$ and $h$ are. Chance chooses an element $g \in \Gamma$ and hands it to the umpire, who computes $h(g)$. The umpire then reveals $h_i(g)$ to player $i$ only, and nothing more.

Definition. A correlated strategy $h$ is called a correlated equilibrium if

$$E\,u_1(h_1, h_2) \ge E\,u_1(\phi(h_1), h_2) \quad \text{for all } \phi : S(1) \to S(1),$$

and

$$E\,u_2(h_1, h_2) \ge E\,u_2(h_1, \phi(h_2)) \quad \text{for all } \phi : S(2) \to S(2).$$

Thus a CE is achieved when no player can gain by deviating from the umpire's recommendation, assuming the other player will not deviate either. The deviations are restricted to be functions of $h_i$ because player $i$ knows only $h_i(g)$. For more on CE, see Aumann.

We turn now to the notion of calibration. This is one of a number of criteria used to evaluate the reliability of a probability forecast. It has been argued by a number of writers (see Dawid) that calibration is an appealing minimal condition that any respectable probability forecast should satisfy. Dawid offers the following intuitive definition: Suppose that in a long (conceptually infinite) sequence of weather forecasts, we look at all those days for which the forecast probability of precipitation was, say, close to some given value $p$ and (assuming these form an infinite sequence) determine the long run proportion of such days on which the
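
The defining inequalities of a correlated equilibrium can be checked mechanically once the correlated strategy is summarized by the joint distribution it induces on pairs of pure strategies. The Python sketch below is an illustration and is not taken from the paper: the function name `is_correlated_equilibrium`, the payoff matrices (a version of the game of Chicken) and the two test distributions are all assumptions chosen for the example. It relies on the standard observation that the inequality over all deviation maps holds if and only if, for every recommended strategy and every alternative strategy, switching does not raise expected payoff.

```python
def is_correlated_equilibrium(u1, u2, p, tol=1e-9):
    """Check the correlated equilibrium inequalities for a two-player game.

    u1[a][b], u2[a][b]: payoffs when player 1 plays a and player 2 plays b.
    p[a][b]: probability that the umpire recommends the pair (a, b).
    All names here are hypothetical and chosen for illustration only.
    """
    m, n = len(p), len(p[0])

    # Player 1: told to play a, considers playing a_dev instead.
    for a in range(m):
        for a_dev in range(m):
            gain = sum(p[a][b] * (u1[a_dev][b] - u1[a][b]) for b in range(n))
            if gain > tol:
                return False

    # Player 2: told to play b, considers playing b_dev instead.
    for b in range(n):
        for b_dev in range(n):
            gain = sum(p[a][b] * (u2[a][b_dev] - u2[a][b]) for a in range(m))
            if gain > tol:
                return False

    return True


# Assumed example game (Chicken): strategy 0 = Dare, strategy 1 = Chicken.
U1 = [[0, 7],
      [2, 6]]
U2 = [[0, 2],
      [7, 6]]

# Equal weight on (Dare, Chicken), (Chicken, Dare), (Chicken, Chicken).
P_CE = [[0.0, 1 / 3],
        [1 / 3, 1 / 3]]

# All weight on (Chicken, Chicken): each player would rather Dare.
P_BAD = [[0.0, 0.0],
         [0.0, 1.0]]

print(is_correlated_equilibrium(U1, U2, P_CE))   # True
print(is_correlated_equilibrium(U1, U2, P_BAD))  # False
```

Working with the joint distribution rather than the probability space and the map $h$ loses nothing for this check, since the expectations in the definition depend on $h$ only through the distribution of $(h_1, h_2)$.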
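
The calibration criterion introduced above compares each announced forecast probability with the empirical frequency of the event on the occasions when that forecast was made. A minimal Python sketch for finite data follows. The function names and the squared-error summary score are assumptions made for illustration, forecasts are assumed to take values in a finite set so that they can be grouped exactly, and a finite sample can only approximate the long-run property that calibration describes.

```python
from collections import defaultdict


def calibration_table(forecasts, outcomes):
    """Group periods by announced forecast value.

    forecasts: probabilities announced each period (from a finite set).
    outcomes: 0/1 indicators of whether the event occurred.
    Returns {forecast value: (empirical frequency, number of periods)}.
    """
    counts = defaultdict(int)
    hits = defaultdict(int)
    for p, y in zip(forecasts, outcomes):
        counts[p] += 1
        hits[p] += y
    return {p: (hits[p] / counts[p], counts[p]) for p in counts}


def calibration_score(forecasts, outcomes):
    """Weighted squared gap between forecasts and empirical frequencies.

    This particular score is an illustrative assumption. It is zero
    exactly when every forecast value matches the observed frequency
    on the periods in which it was announced.
    """
    table = calibration_table(forecasts, outcomes)
    total = len(forecasts)
    return sum((n / total) * (freq - p) ** 2 for p, (freq, n) in table.items())


# Forecast 0.5 on four days (event occurs on two), 1.0 on two days (both occur).
good = calibration_score([0.5, 0.5, 0.5, 0.5, 1.0, 1.0], [1, 0, 1, 0, 1, 1])

# Forecast 0.5 on four days, but the event occurs every day.
bad = calibration_score([0.5, 0.5, 0.5, 0.5], [1, 1, 1, 1])

print(good)  # 0.0
print(bad)   # 0.25
```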
