University of Pennsylvania ScholarlyCommons

Statistics Papers Wharton Faculty Research

10-1997

Calibrated Learning and Correlated Equilibrium

Dean P. Foster University of Pennsylvania

Rakesh V. Vohra

Follow this and additional works at: https://repository.upenn.edu/statistics_papers

Part of the Behavioral Economics Commons, and the Statistics and Probability Commons

Recommended Citation Foster, D. P., & Vohra, R. V. (1997). Calibrated Learning and Correlated Equilibrium. Games and Economic Behavior, 21 (1-2), 40-55. http://dx.doi.org/10.1006/game.1997.0595

This paper is posted at ScholarlyCommons. https://repository.upenn.edu/statistics_papers/589 For more information, please contact [email protected]. Calibrated Learning and Correlated Equilibrium

Abstract Suppose two players repeatedly meet each other to play a game where:

1. each uses a learning rule with the property that it is a calibrated forecast of the other's plays, and

2. each plays a myopic to this forecast distribution.

Then, the limit points of the sequence of plays are correlated equilibria. In fact, for each correlated equilibrium there is some calibrated learning rule that the players can use which results in their playing this correlated equilibrium in the limit. Thus, the statistical concept of a calibration is strongly related to the game theoretic concept of correlated equilibrium.

Disciplines Behavioral Economics | Statistics and Probability

This journal article is available at ScholarlyCommons: https://repository.upenn.edu/statistics_papers/589

Calibrated Learning and Correlated

Equilibrium

Dean P Foster Rakesh V Vohra

University of Pennsylvania Ohio State University

y

Philadelphia PA Columbus OH

First draft May Revised June

This version Octob er

Emailfosterhellsparkwhartonup ennedu

y

Emailvohraosuedu

Abstract

Supp ose two players meet each other in a rep eated game where

each uses a learning rule with the prop erty that it is a calibrated

forecast of the others plays and

each plays a b est resp onse to this forecast distribution

Then the limit p oint of the sequence of plays are Correlated Equilib

ria In fact for each Correlated equilibrium there is some calibrated

learning rule that the players can use which result in their playing this

correlated equilibrium in the limit Thus the statistical concept of cal

ibration is strongly related to the game theoretic concept of correlated

equilibrium

Intro duction

The concept of a NE is so imp ortant to

that an extensive literature devoted to its defense and advancement exists

Even so there are asp ects of the Nash equilibrium concept that are puzzling

One is why any player should assume that the other will play their Nash

equilibrium Aumann says This is particularly p erplexing

when as often happ ens there are multiple equilibria but it has considerable

force even when the equilibrium is unique

One resolution is to argue that the assumption ab out an opp onents plays

are the of some learning pro cess see for example Chapter of Kreps

a Learning is mo deled as recurrent up dating Players cho ose a b est

reply on the basis of their forecasts of their opp onents future choices Fore

casts are describ ed as a function of previous plays in the rep eated game

Much attention has fo cused on developing forecast rules by which a Nash

equilibrium or its renements may b e learned Many rules have b een pro

p osed and convergence to Nash equilibrium has b een established under cer

tain conditions see Skyrms For example Fudenb erg and Kreps

intro duce the class of rules satisfying a prop erty called asymptotic myopic

bayes They prove that if convergence takes place it do es so to a NE No

tice that convergence is not guaranteed In summarizing other approaches

Kreps b p oints out in general convergence is not assured This lack

of convergence serves to lessen the imp ortance of NE and its renements

On the p ositive side Milgrom and Rob erts have shown that any

learning rule that requires the player to make approximately b est resp onses

consistent with their exp ectations play tends towards the serially undomi

nated set of strategies They call such learning rules adaptive and prove that

if the sequence of plays converges to a NE or correlated equilibrium then

each players play is consistent with adaptive learning

Learning as we have describ ed it takes place at the level of the indi

vidual An imp ortant class of learning mo dels involve learning at the level

of p opulations evolutionary mo dels Here the dierent strategies are rep

resented by individuals in the p opulation In particular a mixed strategy

would b e represented by assigning an appropriate fraction of the p opulation

to each strategy A pair of individuals is selected at random to play the game

Individuals do not up date their strategies but their numb ers wax and wane

according to their average suitably dened payo Even in this environ

ment convergence to a NE is not guaranteed On the p ositive side results

analogous to Milgrom and Rob erts have b een obtained by Samuelson and

Zheng

A second ob jection to NE is that it is inconsistent with the Bayesian p er

sp ective A Bayesian player starts with a prior over what their opp onent will

select and cho oses a b est resp onse to that To argue that Bayesians should

play the NE of the game is to insist that they each cho ose a particular prior

Aumann has gone further and argued that the con

sistent with the Bayesian p ersp ective is not NE but Correlated Equilibrium

CE Supp ort for such a view can b e found in Nau and McCardle

who characterize CE in terms of the no arbitrage condition so b eloved by

Bayesians Also Kalai and Lehrer show that Bayesian players with

uncontradicted b eliefs learn a correlated equilibrium

In this note we provide a direct link b etween the Bayesian b eliefs of

players to the conclusion that they will play a CE We do this by showing that

a CE can b e learned We do not particular a sp ecic learning rule rather we

restrict our attention to learning rules that p ossess an asymptotic prop erty

called calibration The key result is that if players use any forecasting rule

with the prop erty of b eing calibrated then in rep eated plays of the game

the limit p oints of the sequence of plays are correlated equilibria

The game theoretic imp ortance of calibration follows from a theorem of

Dawid Given the Bayesians prior lo ok at the forecasts generated by

the p osterior The sequences of future events on which this forecast will not

b e calibrated have measure zero That is the Bayesians prior assigns prob

ability zero to such outcomes Thus under the common prior assumption a

bayesian would exp ect all the other players to b e using their p osterior and

hence to b e calibrated Now using our result that calibration implies cor

related equilibria and the common prior assumption shows that bayesians

exp ect that in the limit they will b e playing a correlated equilibrium This

provides an alternative prove to Aumanns pro of that the common prior as

sumption and rationality implies a correlated equilibrium If the common

prior assumption holds then it is that all players are cal

ibrated If the players use a Bayesian forecasting scheme that is calibrated

then by the ab ove in rep eated plays of the game the limit p oints of the

sequence of plays are correlated equilibria

In the next section of this pap er we intro duce notation and provide a rig

orous denition of some of the terms used in the intro duction Subsequently

we state and prove the main result of our pap er For ease of exp osition we

consider only the p erson case However our results generalize easily to the

np erson case

1

See discussion after Theorem

Notation and Denitions

For i denote by S i the nite set of pure strategies of player i and

by u x y the payo to player i where x S and y S Let

i

m jS j and n jS j A correlated strategy is a function h from a nite

probability space into S S ie h h h is a random variable

whose values are pairs of strategies one from S and the other from S

Note that if h is a correlated strategy then u h h is a real valued random

i

variable

So as to understand the denition of a correlated equilibrium imagine an

umpire who announces to b oth players what and h are Chance cho oses an

element g and hands it to the umpire who computes hg The umpire

then reveals h g to player i only and nothing more

i

Denition A correlated strategy h is cal led a correlated equilibrium if

E u h h E u h h for al l S S

and

E u h h E u h h for al l S S

Thus a CE is achieved when no player can gain by deviating from the

umpires recommendation assuming the other player will not deviate either

The deviations are restricted to b e functions of h b ecause player i knows

i

only h g For more on CE see Aumann and Aumann

i

We turn now to the notion of calibration This is one of a numb er of

criteria used to evaluate the reliability of a probability forecast It has b een

argued by a numb er of writers see Dawid that calibration is an

app ealing minimal condition that any resp ectable probability forecast should

satisfy Dawid oers the following intuitive denition

Supp ose that in a long conceptually innite sequence of weather

forecasts we lo ok at all those days for which the forecast prob

ability of precipitation was say close to some given value p and

assuming these form an innite sequence determine the long

run prop ortion of such days on which the forecast event rain

in fact o ccurred The plot of against p is termed the forecasters

empirical calibration curve If the curve is the diagonal p

the forecaster may b e termed wel l calibrated

To give the notion a formal denition supp ose that player is using

a forecasting scheme f The output of f in round t of play is an ntuple

f t fp t p tg where p t is the forecasted probability that player

n j

will play strategy j S at time t Let j t if player plays their

j th strategy in round t and zero otherwise Denote by N p t the numb er

of rounds up to the tth round that f generated a vector of forecasts equal

to p Let p j t b e the fraction of these rounds for which player plays j

ie

t

X

I j s

f sp

p j t

N p t

s

if N p t and zero otherwise

The forecast f is said to calibrated with resp ect to the sequences of plays

made by player if

X

N p t

jp j t p j lim

j

t!1

t

p

2

Dawid page His notation has b een changed to match ours

for all j S Notice that taking is now seen not to matter since

the only time it will o ccur is if N p t and thus it would b e multiplied

by zero anyway Roughly calibration says that the empirical frequencies

conditioned on the assessments converge to the assessments This is to b e

contrasted with the asymptotic myopic bayes condition of Fudenb erg and

Kreps which says that the empirical frequencies in round t converge together

with the assessments in round t

Calibration and Correlated Equilibrium

It is clear from the denition of correlated strategies that a CE is simply

a joint distribution over S S with a particular prop erty Hence we

fo cus on D x y the fraction of times up to time t that player plays x

t

and player plays y This is the empirical joint distribution We assume

that when players select their b est resp onse for a given forecast they use

a a stationary and deterministic tie breaking rule say the lowest indexed

strategy

Theorem Let be the set of al l correlated equilibria If each player uses

a forecast that is calibrated against the others sequence of plays and then

makes a best response to this forecast then

min max jD x y D x y j

t

D 2

x2S y 2S

as t the number of rounds of play tends to innity

PROOF Observe rst that the nmtuple each of whose comp onents is

of the form D x y lies in the nm dimensional unit simplex By the

t

BolzanoWeirstrass theorem any b ounded sequence in it contains a conver

gent subsequence Thus for any subsequence fD x y g and D x y such

t

i

that

X

jD x y D x y j

t

i

x2S (1)

y 2S (2)

we need to show that D is a CE

For each x S let M x b e the set of mixtures over S for which x

b

is a b est resp onse M x is a closed convex subset of the n dimensional

b

simplex Let M x b e the set of mixtures where player actually plays x

p

given that the forecast is in M x By the assumption that players cho ose

p

b est resp onses M x M x Further fM x x S g forms a

p b p

partition of the simplex The empirical conditional distribution of y S

xy D

t D xy

i

P P

given that player played x is This converges to

xc D D xc

t

i

c2S (2) c2S (2)

P

as long as D x c do es not converge to zero If it did it would mean

t

c2S

i

that the prop ortion of times that x is played tends to zero Hence in the

limit player never plays x so it can b e ignored To complete the pro of

D xy

P

is it suces to show that the ntuple whose y th comp onent is

D xc

c2S (2)

contained in M x Observe that

b

X

D x y t y r

t

i

i

r t f r 2M x

p

i

X X

t y r

i

p2M x r t f r p

p

i

X

t p y t N p t

i i

i

p2M x

p

X

t p N p t

y i

i

p2M x

p

X

t p y t p N p t

i y i

i

p2M x

p

Since the forecasts b eing used are calibrated the second term in the last

expression go es to zero as t tends to innity Note

X

N p t

i

M x p

P

b y

N q t

i

p2M x

p

q 2M x

p

b ecause it is a convex combination of vectors in M x f recall M M g

b p b

and M x is convex Therefore

b

X

D x y N p t

X

lim p

P

y

t !1

i

D x c

N p t

c2S

p2M x

p

p2M x

p

th

which is then the y comp onent of a vector in M x also

b

x y g contains a convergent sub We have shown that any sequence fD

t

i

sequence whose limit is a CE The theorem now follows 2

In some sense the result ab ove is not surprising We know from Milgrom

and Rob erts if players use b est resp onses they eliminate dominated

strategies Secondly the calibration requirement forces limit p oints to satisfy

an additional equilibrium requirement Correlation arises b ecause players are

able to condition on previous plays

It is natural to ask if Theorem would hold with a nonstationary tie

breaking rule The following version of shows that this is

not p ossible In each round the row player will forecast that there is a

Matching Pennies

h t

H n n

T n n

chance that column will play heads and a chance that column will play

tails ie is the forecast The column player will do likewise Given

these forecasts there is a tie for the b est reply Consider the following tie

breaking rule on even numb ered rounds play heads and tails on the other

rounds Notice that the resulting sequence of plays will b e Tt Hh Tt Hh

Clearly the forecasts of each player are calibrated but the distribution

of plays do es not converge to a CE

Theorem raises the question of how a calibrated forecast is to b e pro

duced Oakes has shown that there is no deterministic forecast that is

calibrated for al l possible sequences of outcomes Our requirements are more

mo dest Given a game and a correlated equilibrium of this game is there

a sequence of plays and a deterministic forecasting rule dep ending only on

observed histories that is calibrated The next theorem provides a p ositive

answer to this question

Denition Cal l a point of the distribution D x y a limit p oint of cal

ibrated forecasts if there exist deterministic best reply functions R and

i

calibrated forecasting rules p such that if each player i plays R p then the

i i i

limiting joint distribution wil l be D x y

Denition Let be the set of al l distributions which are limit points of

calibrated forecasts

Using this notation we can restate Theorem as saying that

mn

We can represent every game by a vector in where each comp onent

corresp onds to a players payo A set of games is of measure zero if the

mn

corresp onding set of p oints in has Leb esgue measure zero

Theorem For almost every game G G In other words for al

most every game the set of distributions which calibrated learning rules can

converge to is identical to the set of correlated equilibriums

Pro of Because of Theorem we need only prove that Let

x y b e a deterministic computable sequence such that the limiting joint

j j

distribution is D x y At time j have player forecast

X

p D x D x y

j j j

y 2S

and player two forecast

X

p D y D x y

j j j

x2S

By the assumption that the joint distribution converges to D x y it is clear

that b oth of these forecasts are calibrated Further x is in fact a b est

j

resp onse to the forecast p and y to p So dene R p such that

j j j

for all j x R p and similarly for R p These forecasts and these

j j

b est reply functions are the key idea of the pro of In fact in the situation

where R and R are b oth well dened we have completed the pro of

But R and R might not b e well dened In other words there

00 0 0 00

00 0

x x and x and x such that x might b e two dierent strategies x

j j



00 0

p This is where the almost every game condition p then p

j j

comes into play

Almost every game has the prop erty that all the sets M x have non

b

empty interior To see why this is the case observe that M x is formed by

b

the intersection of halfspaces Start with a closed convex set with nonempty

interior C say and add these halfspaces one at a time We can cho ose C

to b e the simplex of all mixed strategies Consider a halfspace H chosen at

random such that the co ecients that dene H are continuous with resp ect

to leb esgue measure We claim that the intersection of C and H is either the

empty set or a set with an op en interior

Pick a p oint p in the interior of C Let q b e the p oint in the b oundary of H

which is closest to p Let v b e the ray from p to q and d its length Both v and

d have continuous distributions since they are a continuous transformation of

the halfspace H Now consider distribution of d conditional on v Given v

there is a unique d such that H will b e tangent to C and not contain C The

conditional probability of d taking this value is Hence the unconditional

probability is zero also

The interiors of the sets of the form M x are disjoint Thus near the

b

00 0

x  x

such that the unique b est resp onse to and p p oint p there are p oints p

00 0 00 0

x 00 x 0 x x

or p is x Forecasting p is x and the unique b est resp onse to p p



instead of p makes the reply function well dened Unfortunately when the

0

x 

forecast of p is made the actual frequency will turn out to b e p Thus

0 0

 x x

to p j If we can cho ose p the calibration score will b e o by at most jp



b e convergent to p solves this last problem and our pro of is complete

0 0 0

x  x x

Dene a sequence p ip ip Then p converges to

i i

0 0

 x 0 x

p and for all i p has x as its unique b est reply For each i forecast p

i i

suciently many times to ensure that there is a high probability that the



empirical distribution is within i of p With high probability the empirical

0 0

x x

frequency conditional on forecast p will b e within i of p and hence the

i i

calibration score will converge to zero 2

3

The interiors and the union of the b oundaries would form a partition

To see why theorem only holds for almost every game and not every

game consider the following game

Example of

A n n n

B n n n

n n n C

If ROW randomizes b etween A and B with equal probability and COL

plays then this is a Correlated Equilibrium with a payo of But the

only p oint in is the distribution which puts all its weight on p oint C

which yields a payo of This is b ecause M A M B f g

b b

and M c is the entire simplex So if R A then ROW will

b

ROW

never play strategy B and likewise if R B then ROW will

ROW

never play A So a mixture of A and B is imp ossible and thus the payo

is imp ossible Thus

Can Theorem b e strengthened such that convergence to Nash Equi

librium is assured instead of to a CE The previous theorem shows if one

assumes only calibration one gets any CE in So without further assump

tions on the forecasting rule convergence to Nash cannot b e assured In

particular by adding an assumption that the limit exists do es not rene the

equilibrium attained in contrast with Fudenb erg and Kreps who show that

if a limit exists it must b e Nash This is b ecause Theorem do es not just

nd an accumulation p oint it nds a direct limit

Is it easy to construct a forecast that is calibrated Given the imp ossibil

ity theorem of Oakes the existence of a deterministic scheme that is

calibrated for al l sequences is ruled out However a randomized forecasting

scheme is p ossible

Theorem Foster and Vohra There exists a randomized fore

cast that player can use such that no matter what learning rule player

0

uses player wil l be calibrated That is to say player s calibration score

X X

N p t

C jp j t p j

t j

t

p

j 2S

converges to zero in probability In other words for al l we have that

lim P C

t

t!1

Pro of See the app endix

The imp ortant thing to notice ab out Theorem is that each player can

individually cho ose to b e calibrated The other player can not foil this choice

Player do es not have to assume that player is using an exchangeable

sequence nor that the player is rational Player is still calibrated if

player plays any arbitrary sequence Secondly the pro of is constructive

ie there is an explicit algorithm for pro ducing such a forecast To extend

this result to the np erson case the forecasting rules must predict the joint

distribution of what everyone else will play

If in Theorem we require only that the players use a forecasting rule

that is close to calibrated in the sense of Theorem we obtain

Corollary There exists a randomized forecasting scheme such that if both

player and player fol low this scheme then FOR ANY normal form matrix

game and for al l there exists a t such that for al l t t

P min max jD x y D x y j

t

xy

D 2

4

The most involved step is inverting a matrix

In other words D converges in probability to the set under the Hausdor

t

topology

The Shapley Game and

The most famous of learning rules for games is called Fictitious Play FP

rst conceived in by George Brown In a two p erson game it go es as

follows

Denition Denition of Fictitious Play Row computes the proportion of

times up to the present that Column has played each of hisher strategies

Then Row treats these proportions as the probabilities that Column wil l select

from among hisher strategies Row then selects the strategy that is hisher

best response Column does likewise

In Julia Robinson proved that FP converges to a NE in p erson

zero sum games

After the Robinson pap er interest naturally turned to trying to generalize

Robinsons theorem to nonzero sum games In K Miyasawa proved

that FP converges to a NE in p erson nonzero sum games where each player

has at most two strategies However in dashed hop es

of a generalization by describing a nonzero sum game consisting of three

strategies for each player in which FP did not converge to a NE In this

section we show that FP do esnt converge to a Correlated Equilibrium We

use Shapleys original example

5

See Monderer and Shapley for other situations in which FP converges

Payo Matrix for Shapley Game

n n n

n n n

n n n

As observed by Shapley FP in this game will oscillate b etween states

then then then rep eat Fictitious play

stays longer and longer in each state so the p erio ds of oscillation get larger

and larger There is only one Correlated equilibrium with supp ort on these

six states It assigns probability to each state Fictitious play is never

close to this distribution Thus it do es not converge to a CE

6

Using Nau and McCardle the following linear program pro duces all the CE

p p p p p p p

11 12 22 23 33 31 11

p p p p

13 11 13 23

p p p p

21 22 21 31

p p p p

32 33 32 12

Which is equivalent to the LP p p p p p p p p p p

11 12 22 23 33 31 11 13 11 21

p p p Adding the constraint that p p p this LP has a unique

11 32 11 13 21 32

solution of p p p p p p

11 12 22 23 33 31

7

This can b e see either by direct calculation or by the following trick If Fictitious play

was ever close to this CE then the marginals would have to b e close to

But these marginals corresp ond to the Nash Equilibrium Shapley created this example

precisely to show that the marginals didnt converge to the marginals of the Nash equi

librium in fact the marginals are b ounded away from the p oint Thus the

Nash equilibrium is not an accumulation p oint of the sequence of plays Thus we know

that the marginals are never close to b eing correct and thus the joint distribution is also

never close

The Shapley game is interesting b ecause it has a CE which is not a

mixture of Nash Equilibriums Theorem tells us that there are calibrated

learning rules which will then converge to this CE The exp ected payo is

which Pareto dominates the Nash payo of

Postscript

Earlier versions of this pap er as well as presentations of the results at various

conferences have generated a deal of follow on pap ers on calibration and its

connections to game theory In this section we give a brief description of

some of this work

Theorem which establishes the existence of randomized forecasting

scheme that is calibrated has prompted a numb er of alternative pro ofs The

rst of these was due to Sergiu Hart p ersonal communication and is par

ticularly simple and short It makes use of the minimax theorem The draw

back is that the scheme implied by the metho d is impractical to implement

Indep endently Fudenb erg and Levine also gave a pro of using the min

imax theorem The approach is more elab orate than Harts but pro duces a

forecasting scheme that is practical to implement In a follow up pap er Fu

denb erg and Levine consider a renement of the calibration idea that

involves the classication of observations into various categories For this

renement they derive a pro cedure that yields almost as high a timeaverage

payo as could b e obtained if the player cho oses knowing the conditional

distribution of actions given categories If players use such a pro cedure long

run the time average play resemb eles a correlated equilibrium

8

The unique Nash Equilibrium for this game is vs So

any CE which isnt Nash is also not a mixture of Nash Equilibriums

Our own pro of of Theorem which is describ ed in the app endix is based

on establishing the existence of a forecasting scheme that has a prop erty

called noregret An pro of along the lines of Theorem shows that a no

regret pro cedure would also lead to a correlated equilibrium Hart and Mas

Collel have extended this idea in many ways First they proved a

very elegant pro of of noregret based on Blackwells vector minimax

theorem Second they mo dify this scheme which requires a matrix inversion

to one that involves regretmatching This greatly reduces the computations

required to implement the pro cedure The simplied pro cedure no longer

has the noregret prop erty but it will converge to a correlated equilibrium

Their theorem is much harder to prove since they cant simply app eal to a

noregretcalibration prop erty as we have done

Kalai Lehrer and Smoro dinsky have recently shown that the no

tion of calibration is mathematically equivalent to that of merging This

allows one to establish relationships b etween convergence results based on

merging and those based on calibration and so derive some new convergence

results

App endix

This app endix provides a telegraphic pro of of Theorem For more details

see Foster and Vohra

We will rst prove a prop erty called noregret Consider k forecasts

i

each with a loss or p enalty at time t of L for i k Now con

t

sider a randomized forecast which picks forecast i at time t with probability

i

w We dene the loss from using the combined forecast to b e the weighted

t

P

k

i i

w sum of the losses of each forecast namely L

i t t

Denition The regret generated by changing al l i forecasts to j forecasts

i!j ij ij

is R maxf S g S I ij where I is the indicator function and

x

T T T

S

T

T

X

j ij

i i

w L L S

t

t t T

t

We cho ose the probability vector w so that it satises the following ow

t

conservation equations

k k

X X

j j !i i!j

i

w R R i w

t

t t

t

j j

The duality theorem of linear programming can b e used to establish the

P

k

i

w existence of a nonnegative solution w to this system such that

t

t i

 

Lemma Noregret For al l i and j the regret grows as the squareroot

p

 

i !j

of T In particular R k T

T

x

G x we see that Pro of Let G x I Since x

x

X

   

i !j ij i j

R G S G S

T T T

ij

0

x xI and so Now G

x

X X

j ij ij ij ij

i i 0

ij

w L L S I S S G S

t t

t t t

t t

S

t1

ij ij

X X X

j !i j i!j

i i

R w R L w

t

t t

t t

j j i

z

by ow conservation

ij ij

Expanding G S as a two term Taylor series around S shows

t

t

X X X X

j ij ij ij ij ij

i i 0

w L L S S G S G S G S

t t t

t t t

t t

ij ij ij ij

X X

ij

i

w G S k

t

t

i ij

X

ij

k G S

t

ij

 

P

ij i !j

Computing the recursive sum we see that G S T k and so R

ij

T T

p p

 

i !j

T k Picking k T shows R T k 2

T

We will now show that for a suitable loss function a randomized forecast

that has no regret must also b e calibrated

First our forecasting scheme will cho ose in each round a probability

i

vector from the set fp ji k g which is chosen so that any

probability distribution over S the opp onents strategies is within

of one of these p oints

We denote the move made by player in by the vector X X X X

t t t t

where X if strategy j S was chosen and zero otherwise No

tj

tice that X will b e a vector with exactly on nonzero comp onent

t

i i

Next the loss incurred in round t from forecasting p will b e L

t

P

i i

jX p j jX p j

t tj

j 2S

j

i i

The probability of forecasting p at time t will b e w

t

We would like to cho ose the w s so that L calibration C t go es to zero

t

in probability as t gets large where

X

N p t

C t p j t p

j

t

p

The exp ected value of C t is given by

k t

X X X

i i i

w p j s p s E C t

t

t j

i s

j 2S

Simple algebra yields

X X

i!j i!j

max R t max R t E C t

t t

j j

i i

If the probabilities w s are chosen to satisfy the ow conservation equations

t

displayed earlier we deduce that

k

p

E C t O

t

Thus if we let k grow slowly and go slowly to zero we see that C t

in exp ectation which implies C t in probability by Jensens inequality

The L calibration denition of equation follows from the fact that it

is smaller than the square ro ot of the L calibration Thus we have proved

Theorem

REFERENCES

Aumann R J Sub jectivity and Correlation in Randomized Strategies

Journal of Mathematical Economics

Aumann R J Correlated Equilibrium as an Expression of Bayes Ra

tionality Econometrica

Blackwell D A vector valued analog of the minimax theorem Pacic

J Math

Dawid A P The Well Calibrated Bayesian Journal of the American

Statistical Association

Foster D P and R Vohra Asymptotic Calibration unpublished manuscript

Foster D P and H P Young Sto chastic Evolutionary Game Dynam

ics Theoretical Population Biology

Fudenb erg D and D Kreps Exp erimentation Learning and Equilib

rium in Extensive Form Games unpublished notes

Fudenb erg D and D Levine An easier way to Calibrate unpublished

manuscript

Fudenb erg D and D Levine Conditional Universal Consistency un

published manuscript

Hart S and MasCollel A A Simple Adaptive Pro cedure Leading to

Correlated Equilibrium unpublished manuscript

Kalai E and E Lehrer Sub jective Games and Equilibria unpublished

manuscript

Kalai E E Lehrer and R Smoro dinsky Calibrated Forecasting and

Merging manuscript

Kreps D Game Theory and Economic Mo deling Oxford University

Press Oxford a

Kreps D Seminar on Learning in Games University of Chicago b

Milgrom P and J Rob erts Adaptive and Sophisticated Learning in

Normal Form Games Games and Economic Behavior

Monderer D and L Shapley Fictitous Play Prop erty for games with

Identical Interests working pap er Department of Economics UCLA

Miyasawa K On the Convergence of the Learning Pro cess in a

Nonzero sum Twop erson Game Economic Research Program Princeton

University Research Memorandum

Nau R F and K McCardle Coherent Behavior in Nonco op erative

Games Journal of Economic Theory

Oakes D SelfCalibrating Priors Do Not Exist Journal of the Ameri

can Statistical Association

Robinson J An Iterative Metho d of Solving a Game Annals of Math

ematics

Samuelson L and J Zhang Evolutionary Stability in Asymmetric Games

Journal of Economic Theory

Shapley L Some Topics in TwoPerson Games Advances in Game

Theory Princeton University Press Princeton

Skyrms B The Dynamics of Rational Deliberation Harvard University

Press