Coalition Formation Processes with Belief Revision Among Bounded

Home , Bounded rationality

Coalition Formation Pro cesses with Belief

Revision among Bounded-Rational

Self-Interested Agents

Fernando Tohm e

Tuomas Sandholm

Email: ftohme, [email protected]

Department of Computer Science

Washington University

Campus Box 1045

One Bro okings Drive

St. Louis, MO 63130

Phone : 314 935-6160

Fax : 314 935-7302

Abstract

This pap er studies coalition formation among self-interested agents

that cannot make sidepayments. We show that -core stability reduces

to analyzing whether some utility pro le is maximal for all agents. We

also show that strategy pro les that lead to the -core are a subset

of Strong Nash equilibria. This fact carries our -core-based stability

results directly over to two other strategic solution concepts: Nash

equilibrium and Coalition-Pro of Nash equilibrium.

This material is based up on work supp orted by the National Science Foundation under

Tuomas Sandholm's CAREER Award IRI-9703122, Grant IRI-9610122, and Grant I IS-

9800994. Fernando Tohm ewas also supp orted by the Departamento de Econom a of

the Universidad Nacional del Sur, in Argentina and byaFellowship from the National

Research Council CONICET of Argentina. An early version of this pap er app eared in a

workshop [41] 1

The main fo cus of the pap er is to analyze the dynamic pro cess

of coalition formation by explicitly mo deling the costs of communica-

tion and delib eration. We describ e an algorithm for sequential action

choice where each agent greedily stepwise maximizes its payo given

its b eliefs. Conditions are derived under which this pro cess leads to

convergence of the agents' b eliefs and to a stable coalition structure.

We derive these results for the case where the length of the pro cess is

exogenously restricted as well as for the case where agents can cho ose

it.

Finally, we show that the outcome of any communica-

tion/delib eration pro cess that leads to a stable coalition structure is

Pareto-optimal for the original game which do es not incorp orate com-

munication or delib eration. Conversely,anyPareto-optimal outcome

can b e supp orted by a communication/delib eration pro cess that leads

to a stable coalition structure.

1 Intro duction

In manymultiagent settings, self-interested agents|e.g. representing real

world companies or individuals|can op erate more e ectively by forming

coalitions and co ordinating their activities within each coalition. Therefore,

ecient metho ds for coalition formation are of key imp ortance in multia-

gent systems. Coalition formation involves three activities: coalition struc-

ture generation partitioning the agents into disjoint coalitions , solving

each coalition's optimization problem within the coalition, and dividing

the value of each coalition among memb er agents in case of net cost, this

value may b e negative.

Coalition formation among self-interested agents has b een widely studied

in game theory [40, 11 , 2, 1 , 4, 9]. The main solution concepts are geared

toward payo division among agents in ways that guarantee forms of stabil-

ity of the coalition structure. Most of these solution concepts fo cus on the

nal solution, and usually do not address the dynamic pro cess that leads to

the solution. DAI work on coalition formation has intro duced proto cols for

dynamic coalition formation, but the role of strategies and the pro cess itself

This is the usual de nition of coalition structures in game theory.For a non-partitional

de nition see [38]. 2

have not b een included in the solution concept. Although the outcomes sat-

isfy di erent forms of stability, there is often no guarantee that the pro cess

itself is stable, i.e. that individual agents would b e b est o by adhering to

the imp osed strategies [37, 39 ]. Also, it is usually assumed that the agents

can compute the coalition values exactly [43, 37 , 39], but sometimes deter-

mining the value of a coalition involves solving an intractable combinatorial

optimization problem, e.g. solving how a coalition of dispatch centers should

route their p o oled vehicles to handle their p o oled delivery tasks. Some DAI

work has addressed the complexity of coalition value computation by explic-

itly incorp orating computational actions in the solution concept [32, 33 ]. This

allows one to game-theoretically trade o computation cost against solution

quality.However, that work do es not include proto cols for dynamic coalition

structure generation all coalition structures were exhaustively enumerated,

nor do es it address b elief revision.

In this pap er we will analyze coalition formation pro cesses from a norma-

tive p ersp ective. Each agent's decision to participate in a coalition dep ends

on strategic considerations since the parties are self-interested and evalu-

ate p ossible agreements based on the advantages they can get from their

memb ership to a given coalition. Since our approach is normative, it is nec-

essary to identify the motivations of each agent. Utilities constitute a natural

representation for the di erences among agents: utilities are numerical rep-

resentations of their di erent preferences.

In some multiagent systems agents can make sidepayments. On the other

hand, in many settings it is desirable to b e able to handle the interactions

without sidepayments. First, the agents might not have enough money,or

a secure mechanism for transferring money might not exist. Second, most in-

teraction proto cols mechanisms that use sidepayments are only guaranteed

to work if every agent's utility is linear in money, which is not often the case.

This pap er fo cuses on games where agents cannot make sidepayments.

Previous research has mostly fo cused on superadditive games [11 , 43 ].

Sup eradditivity means that any pair of coalitions is b est o by merging into

This is the case, for example, with the classic Clarke tax mechanism [6, 8, 10, 19]. See

also the discussion on mo dulo mechanisms in [20].

2-agent negotiations in MAS have already often b een analyzed in the setting where

agents cannot make sidepayments [27].

However, algorithms for coalition structure generation in non-sup eradditive settings

have also b een presented [34, 32, 37, 38, 39,13]. 3

one. Classically it is argued that almost all games are sup eradditive b ecause,

at worst, the agents in a comp osite coalition can use solutions that they

had when they were in separate coalitions. This pap er will also fo cus on

sup eradditive games. However it should b e p ointed out that not all games

need to b e sup eradditive. Sup eradditivitymay b e violated, for example, if

the setting has anti-cartel p enalties or co ordination overhead such as com-

munication costs or computation costs whichmay increase as the number

of agents in a coalition increases [32 , 33 ]. Since the agents studied in this

pap er are self-interested, the appropriate solution concept is one that em-

phasizes the stability of the coalition structure. This means trying to reach

agreements where no subgroup of agents is motivated to deviate from the

solution. The core is one of such solution concept [19]. The core is usually

used for games with transferable utility [40, 11 ] but it can b e extended to

games with nontransferable utility, for example using Aumann's notion of

the -core [2] which will b e used in this pap er as well.

Anovel asp ect of our approach is the role that b elief formation plays

in the coalition formation pro cess. This pro cess can b e characterized as a

sequence of delib eration computation and communication actions that the

agents take in the dynamic pro cess of coalition formation. Agents' incom-

plete information leads to standard solution concepts, suchasBayes-Nash

equilibrium [19 ] or sequential equilibrium [16 ]. Here we will b e concerned

with a solution concept that combines b oth coalition stability and incomplete

information. The basic idea is that a belief of an agent in a given stage of

the coalition formation pro cess is a conditional probability distribution on

the outcomes, given the previous steps of the pro cess. A coalition structure

obtains stability when the b eliefs of the agents converge. We show that this

coalition structure supp orts a Pareto optimal outcome. The price paid is

tractability: the computation of the optimal coalition formation pro cess can

b e exp onential in the numb er of agents and in the length of the negotiation

pro cess.

A sp ecial case of coalition formation where agents cannot make sidepayments is the

exchange economy, where parties exchange multidimensional endowments. Exchange

economies exhibit sup eradditivity: the b est outcome can b e reached by the grand coalition

consisting of all the agents [21].

Since the problem addressed here is coalition formation with indep endent, self-

interested agents with incomplete information, exp onential complexity seems an almost

unavoidable consequence. See [26] for a pro of of the exp onential complexity inherenteven 4

As is common practice [11 , 39 , 43 , 13 , 32 , 34 ], we start out by studying

coalition formation in characteristic function games CFGs. In such games,

the value of each coalition is indep endent of nonmemb ers' actions. In general

the value of a coalition could dep end on nonmemb ers' actions due to p ositive

and negative externalities interactions of the agents' solutions. Negative

externalities b etween a coalition and nonmemb ers are often caused by shared

resources. Once nonmemb ers are using the resource to a certain extent,

not enough of that resource is available to agents in the coalition to carry

out the planned solution at the minimum cost. Negative externalities can

also b e caused by con icting goals. In satisfying their goals, nonmemb ers

may actually move the world further from the coalition's goal states [27 ].

Positive externalities are often caused by partially overlapping goals. In

satisfying their goals, nonmemb ers may actually move the world closer to

the coalition's goal states. From there the coalition can reach its goals less

exp ensively than it could have without the actions of nonmemb ers. General

settings with p ossible externalities can b e mo deled as normal form games

NFGs. CFGs are a strict subset of NFGs. However, many real-world

multiagent problems happ en to b e CFGs [32]. Also, as we show, the claims

that we will make using the convenient notation of CFGs directly carry over

to NFGs.

The game theoretic context for this work is the Nash program whichwas

originally presented in [23], and which has recently b een widely adopted as

the analysis metho d of choice in computational multiagent systems consist-

ing of self-interested agents [31, 36, 35 , 14]. The idea is that interactions

should b e studied by analyzing stable combinations of strategies|one for

each agent. The sequential pro cess of coalition formation is similar to the

sequential pro cess of bargaining where agents try to reach an agreement

by exchanging prop osals [28]. However, incomplete information intro duces

complications that are related to the credibility of statements and cheap talk:

noncommital statements can induce a multiplicity of equilibria, called \bab-

bling" equilibria [7]. Negotiation has b een prop osed as a solution to this

problem. According to this idea, negotiation using credible not necessarily

in the most ecient decentralized allo cation pro cesses.

These coalition values may represent the quality of the optimal solution for each

coalition's optimization problem, or they may represent the b est b ounded-rational value

that a coalition can get given limited or costly computational resources for solving the

problem [32,33]. 5

truthful statements would lead to coherent agreements where the agents'

b eliefs are mutually consistent [22 ].

Finally, b elief revision in this pap er is based on a representation of b e-

liefs by means of probability distributions. Bayes theorem o ers a to ol for

up dating them. This is a common p oint shared by our approach and b oth

the game theoretical treatments of incomplete information [3], [15 ] and

Bayesian metho ds in arti cial intelligence [24].

The rest of the pap er is organized as follows. Section 2 intro duces the

classic game theoretic framework for coalition formation among agents that

cannot make sidepayments. Section 3 analyzes outcomes statically with the

-core solution concept. Section 4 shows generally that results derived un-

der the -core solution concept carry over directly to strategic solution con-

cepts such as the Nash equilibrium, the Strong Nash equilibrium, and the

Coalition-Pro of Nash equilibrium. Section 5 intro duces the dynamic coali-

tion formation pro cess which incorp orates delib eration and communication.

It shows that stability of the coalition formation pro cess is equivalentto

convergence of the agents' b eliefs for b oth exogenously and endogenously

terminated negotiation, and that the outcome is Pareto-optimal.

2 Games and solutions

This section reviews the concept of a coalition game and an approach for

de ning the value of a coalition characteristic function in games where

nonmemb ers' actions a ect the value of the coalition, and agents cannot

transfer sidepayments. We b egin by de ning a game.

De nition 1 A game G =S ; U is de ned by the set of agents I =

i=1

f1;:::;ng, the set S of possible strategies for each agent i 2 I , and the

resulting utility pro le U : S !<, where for each strategy pro le

i=1

s ;:::;s 2 S ,

1 n i

i=1

U s ;:::;s =u s ;:::;s ;:::;u s ;:::;s

1 n 1 1 n n 1 n

where u : S !< is the utility of agent i.

i i

i=1

A solution concept de nes the reasonable ways that a game can b e played

by self-interested agents: 6

De nition 2 Given a game G =S ; U ,asolution concept in pure

i=1

strategies is a correspondence : G ! S [;, and each

i=1

s =s ;:::;s 2 G is cal led a solution of G.

1 n

The range of a corresp ondence includes the empty set in order to encom-

pass games that do not have a solution of the typ e prescrib ed by the solution

concept.

An example of solution concept is given by Nash equilibria: for each game

G they are elements of G, where is the Nash correspondence:

N N

De nition 3 s =s ;:::;s ;:::;s is in G if for each i and for each

1 i n N

0 0

s 6= s , u s ;:::;s ;:::;s u s ;:::;s ;:::;s . Each such strategy

i i 1 n i 1 i n

i i

pro le is a Nash equilibrium.

In other words, in a Nash equilibrium no agent is motivated to deviate

from its strategy given that the others do not deviate.

De nition 1 characterizes games in terms of the strategies of agents and

the corresp onding payo s. These games are said to b e in normal form. The

normal form is a general representation that can b e used to mo del the fact

that nonmemb ers' actions a ect the value of the coalition [32 , 9 ]. However,

coalition formation has b een mostly studied in a strict subset of normal form

games|characteristic function games|where the value of a coalition do es

not dep end on nonmemb ers' actions, and it can therefore b e represented by

a coalition sp eci c characteristic function which provides a payo for each

coalition T i.e. set of agents [40 , 11 , 43 , 39 ]. Characteristic functions are

a desirable representation, so one would like to de ne such mathematical

entities for normal form games. In such general games, a characteristic func-

tion can only b e de ned by making sp eci c assumptions ab out nonmemb ers'

strategies. In this pap er we follow Aumann's classic approach of making

the -assumption, i.e. assuming that nonmemb ers pick strategies that are

worst for the coalition. Each coalition can lo cally guarantee itself a payo

that is no less than the one prescrib ed by an analysis under this p essimistic

assumption. Later in the pap er we show that the results that we obtain

A pure strategy is a strategy that do es not involve randomization by the agent. De ni-

tion 2 can b e easily extended to mixed strategies that involves randomization, by replacing

each S byS, the set of probability distributions on S .

i i i

The -assumption may b e imp ossibly p essimistic. A given nonmemb er can b e assumed

to pick di erent strategies when di erent coalitions are evaluated. This is in contrast with

the fact that in any realization, the nonmemb er can only pick one strategy. 7

under the -assumption carry over to strategic solution concepts Nash equi-

librium and its re nements that can b e used directly in normal form games

without any assumptions ab out nonmemb ers' strategies.

In games where agents can make sidepayments to each other [37, 39 , 32,

11 ], the characteristic function gives the sum of the payo s of the agents

in a coalition. Instead, our analysis fo cuses on games where agents cannot

make sidepayments. In such games, the characteristic function gives a set

of utility vectors that are achievable [2]. This is in order to provide the

coalition with a set of alternative utility divisions among memb er agents.

The set contains only Pareto-optimal utilityvectors: no agent can b e made

b etter o without making some other agentworse o . The next de nition

formalizes this vector-valued characteristic function under the -assumption.

De nition 4 Given a game G =S ; U , with U such that its compo-

i=1

nents are non-transferable, we say that the characteristic function is

I <

v :2 !2

such that for each coalition T I

v T <

and v T is the set of optimal achievable utilities for T .

The -assumption comes into play in the de nition of these optimal achiev-

able utilities:

De nition 5 Given a game G =S ; U , and a T I , the set of op-

i=1

timal achievable utilities for coalition T = fj ;:::;j g is the set of utility

1 jTj

pro les

U =:::;u ;:::;u ;:::;u ;:::

T j j j

1 2

jTj

such that

9s 2 S ;Us=U

i T

i=1

and

Y Y

T IT

S 69s 2 S : 8s 2

j j

j2T

j62T

T IT

U s ;s = 8

0 0 0 0

U =:::;u ;:::;u ;:::;u ;:::

T j j j

1 2

jTj

0 0

with, for al l j 2 T , u u and for at least one j 2 T; u > u .

i j j

j i j

The next result of ours follows trivially:

Prop osition 1 For every game G in normal form, v exists.

Pro of. Suppose that for a game G =S ; U , v cannot be de ned. So,

i=1

for at least one coalition T , v T cannot be determined. But that means

that a set of U scannot be de ned such that

Y Y

T IT

69s 2 S : 8s 2 S ;

j j

j2T j62T

T IT

U s ;s =

0 0 0 0

U =:::;u ;:::;u ;:::;u ;:::

T j j j

1 2

jTj

0 0

> u . and such that, for al l j 2 T , u u and for at least one j 2 T; u

j i j

Given this condition, we proceed by evaluating U for each s 2 S . If,

i=1

by hypothesis, v T is not determinate, then U is not de ned for every s.

Contradiction. 2

To summarize, the normal form notation emphasizes the strategic asp ects

of the interactions among agents. Each agent's actions are strictly individual

even as a memb er of a coalition. Therefore, a characteristic function has to

indicate, for each coalition, howmuch each of its memb ers gets for joining.

Moreover, the characteristic function has to indicate the b est combinations

of individual payo s that the coalition can get. Prop osition 1 just states that

it is p ossible to de ne this kind of characteristic function for every strategic

game. Once this issue is settled, we can move on to see what kinds of solutions

are relevant for this typ e of coalitional games.

3 The -core and sup eradditivity

The -assumption gives rise to the -core solution concept which de nes

a stability criterion for the coalition structure. The idea is that strategy

pro les that do not have an optimal achievable utility are not candidates for

the solution. A vector of joint strategies, is said to b e blocked by a coalition

if its memb ers can b e b etter o bymoving to another vector: 9

De nition 6 Acoalition T blo cks a strategy pro le s =s ;:::;s if for

1 n

every j 2 T there exists an s 2 S such that

i i

i=1

0 0

8j,u su sand for at least one j , u s >u s; and

i j j i j j

i i i i

0 0 0

s ;::: 2 v T. :::;u s ;:::;u s ;:::;u

j j j

1 2

jTj

The blo cking relation de nes a particular set of stable joint strategies,

the -core. The -core is the set of joint strategies where no coalition can b e

formed if its memb ers are b etter o changing their individual strategies, given

that nonmemb ers pick strategies that are worst for the coalition. In other

words, it is the set of joint strategies for which a stable collective agreement

can b e reached. Formally, the -core corresp ondence is de ned as follows [2]:

De nition 7 A strategy pro le s =s ;:::;s is in the -core if there

1 n C

is no coalition T that blocks s.

As with the Nash corresp ondence, the -core corresp ondence can b e

empty for some games. The following example demonstrates this.

Example 1 G =S ;S ;U, where the set of players is fa; bg

a b

S = S = fc; ncg

a b

and

U = fhc; nci; h0; 10i; hc; ci; h5; 5 i; hnc; ci ; h10 ; 0i; hnc; nci ; h2; 2 ig

where hs ;s i;hu s ;u s i is the general form of the elements of U . This

a b a a b b

is an instance of the prisoner's dilemma [18]. The corresponding values of

the characteristic function are:

v fag= fh10; 0ig

v fbg=fh0; 10ig

v fa; bg= fh5; 5ig

It is easy to see that

fagblocks fhc; ci; hc; nci; hnc; ncig 10

fbgblocks fhc; ci; hnc; ci; hnc; ncig

fa; bg blocks fhnc; ci; hc; nci; hnc; ncig

Therefore, thereisno hs ;s i that is not blockedbyatleast one coalition. In

a b

other words, G is empty.

Now, under what conditions do es a stable coalition structure exist? In

other words, what are the conditions for non-emptyness of the -core? In

the rest of this section we will show that surprisingly simple conditions are

necessary and sucient for stability. The concept of superadditivity will b e

used to develop an intuition of this phenomenon. Sup eradditivity implies

that anytwo coalitions are b est o merging:

De nition 8 A game G is sup eradditive if given any two coalitions T ;T ,

1 2

T \ T = ;, v T \ v T v T [ T .

1 2 1 2 1 2

The following result of ours establishes an interesting prop erty that re-

lates characteristic functions and sup eradditivity: if the intersection of the

optimal ly achievable utilities for al l the players is not empty, then the game

is superadditive. This condition on characteristic functions will b e later used

to discuss stability.

Prop osition 2 For a game G,if v fig6=;then G is superadditive.

i2I

Pro of. We wil l prove this result by induction on the cardinality of coalitions:

Given T ;T , T \ T = ;, jT j =1, jTj=1, it is clear that there

1 2 1 2 1 2

exist agents i; j 2 I such that T = fig and T = fj g. If U =

1 2

u ;:::;u 2 v fig, then in particular U 2 v T \ v T .

1 2

i2I

1 n

Suppose U 62 v T [ T . Then, there exists s 2 S such that

1 2 i

i=1

U s=:::;u s;:::;u s;::: and u s u and u s u with

i j i j

i j

strict inequality for one of them, say i. But then, U 62 v fig,con-

tradiction. So, v T \ v T v T [ T .

1 2 1 2

This de nition byShubik [40] for games without sidepayments di ers technically

from the de nition of sup eradditivity for games with sidepayments [40, 11, 32, 43, 37, 39].

However, they are conceptually the same. 11

Assume that U 2 v T \ v T v T [ T , for any pair of

1 2 1 2

coalitions T ;T , T \ T = ;, jT [ T jk

1 2 1 2 1 2

0 0

i 62 T ;i 62 T , and T = T [fig. Then T \ T = ; and of course

1 2 1 2

1 1

0 0

U 2 v T by the inductive assumption because jT jk, so U 2

1 1

0 0

v T \ v T . Suppose that U 62 v T [ T .Again, this means that

2 2

1 1

exists s 2 S such that U s=:::;u s;:::;u s;:::, where

i j j

i=1 k

0 0

T [ T = fj ;:::;j g, and u s u for al l j 2 T [ T , with strict

2 1 k j i 2

1 i j 1

inequality for one of them, say j . Suppose without loss of generality

0 0

that j 2 T , but then, U 62 v T . Contradiction.

0 1 1

So, v T \ v T v T [ T for any pair of coalitions T ;T , T \T = ;,

1 2 1 2 1 2 1 2

jT [ T j n. That is, G is superadditive. 2

1 2

To see that sup eradditivity is a necessary but not a sucient condition

for v fig 6= ;, recall the prisoner's dilemma of Example 1. It is a

i2I

sup eradditive game, but v fag and v fbghave no element in common.

Our next result relates the condition of the previous prop osition to sta-

bility of the coalition structure non-emptyness of the -core:

Lemma 1 For a game G, G 6= ; i v T 6= ;.

T 22 ;

Pro of.

! If G 6= ;, then there exists an s that is not blockedbyany

coalition. But then, by the de nition of blocked joint strategy, it is

clear that for each coalition T , U s 2 v T , and therefore s 2

v T

T 22 ff;g

If v T 6= ;, then there exists at least one U 2 v T

T 22 f;g

for every possible coalition T and thereforeans2 S such that

i=1

U s=U . So, s is not blocked by any coalition and thus s 2 G. 2

This lemma is useful for proving the following theorem of ours. The

theorem shows that to characterize the stability of the coalition structure

in terms of the -core, only the utilities and the corresp onding actions of

individual agents are required. One do es not need to compare utilities and

actions of coalitions.

Theorem 1 For a game G, G 6= ; i v fig 6= ;.

i2I 12

Pro of.

! If G 6= ;, there exists a s 2 S such that no coalition

C i

i=1

blocks it. So for each coalition T , U s 2 v T .Inparticular for al l

the coalitions with a single member, T = fig. So, U s 2 v fig.

i2I

By Lemma 1, it is enough to prove that v T 6= ;. The

T 22 f;g

proof wil l be by induction on the size of coalitions:

{ given that by hypothesis 9U 2 v fig, then for each i 2 I ,

i2I

U 2 v fig

{ assume that for each coalition T with jT j = k

For any i 2 I; i 62 T by proposition 1:

U 2 v T \ v fig v T [fig

0 0 0

So, U 2 v T , for any T such that jT j = k +1. Therefore,

00 00

U 2 T for any T 2 2 f;g. 2

Since, for each agent i, v fig represents that agent's optimal achievable

utilities, this result states that the non-emptyness of the -core is equivalent

to the existence of at least one utilityvector that is maximal for every agent.

This vector, say U ,isPareto-optimal: there is no other U such that for all

i, U U , with strict inequality for at least one i.

It follows from Theorem 1 that in games without sidepayments, the coali-

tion structure can b e stable only if every p ossible pair of coalitions is b est

o merging [ -core 6= ;] sup eradditivity. This di ers from games with

sidepayments: there the core can b e nonemptyeven if the game is not su-

p eradditive [32]. On the other hand|as in games with sidepayments|the

coalition structure may b e unstable even if every pair of coalitions is b est o

merging sup eradditivity 6 [ -core6= ;].

These results show that the -core is nonempty only in sup eradditive

games. In all other games there exists at least one coalition that blo cks

the solution. While this result is interesting per se, it could p erhaps b e

interpreted as a criticism of the -core solution concept itself. However, with

self-interested agents with non-transferable utilities, all market-like games

are sup eradditive [21]. Market economies are natural examples of this typ e

of games and electronic economies constitute a natural eld of application

for the results of this pap er. 13

4 Relationships between axiomatic and

strategic solution concepts

In this section we present some new relationships b etween axiomatic and

strategic solution concepts. The imp ortance of these relationships lies in the

fact that they allow us to imp ort the other results of this pap er which will

b e derived for the axiomatic -core solution concept directly to strategic

solution concepts like Nash equilibrium and its re nements.

The notion of the -core is axiomatic in that it only characterizes the

outcome without a direct reference to strategic b ehavior. The Nash corre-

sp ondence is, instead, a strategic solution concept: it is based only on the

self-interested strategy choices of agents. Sp eci cally, it analyzes what an

agent's b est strategy is, given the strategies of others. A strategy pro le is

in Nash equilibrium if every agent's strategy is a b est resp onse to the strate-

gies of the others. Nash equilibrium do es not account for the p ossibility

that groups of agents coalitions can change their strategies in a co ordi-

nated manner. Aumann has intro duced a strategic solution concept called

the Strong Nash equilibrium to address this issue [1, 4 ]. A strategy pro le is

in Strong Nash equilibrium if no subgroup of agents is motivated to change

their strategies given that others do not change:

De nition 9 A strategy pro le s 2 S , in a game G,isaStrong Nash

i=1

equilibrium if for any T I and for al l s 2 S there exists an i 2 T

j 0

j 2T

T IT

such that u s u s ;s .

i i

0 0

This concept gives rise to the Strong Nash corresp ondence, , i.e. the

set of Strong Nash equilibria. We show a close relationship b etween the

Strong Nash solution concept and the -core solution concept:

Theorem 2 For any game G, G G.

C SN

Pro of. Suppose that s 2 G but s 62 G. Then, there exist an T I

C SN

T T IT

and an s 2 S such that for al l j 2 C , u s

j j j

j 2T

0 0

means that U s is not in v T , so thereisas such that U s 2 v T and

u s u s. Contradiction. 2

j j

Often the Strong Nash equilibrium is to o strong a solution concept, since

in many games no such equilibrium exists. Recently, the Coalition-Proof 14

Nash equilibrium [4] has b een suggested as a partial remedy to this problem.

This solution concept requires that there is no subgroup that can makea

mutually b ene cial deviation keeping the strategies of nonmemb ers xed

in a way that the deviation itself is stable according to the same criterion.

A conceptual problem with this solution concept is that the deviation may

b e stable within the deviating group, but the solution concept ignores the

p ossibility that some of the agents that deviated may prefer to deviate again

with agents that did not originally deviate. Furthermore, even these kinds

of solutions do not exist in all games. On the other hand, in games where a

solution is stable according to the -core, the solution is also stable according

to the Coalition-Pro of Nash equilibrium solution concept. This is b ecause

G G which follows from our result G G and the

C CP N C SN

known fact that G G.

SN CP N

We can also relate the Nash equilibrium itself to the -core this could al-

ternatively b e deduced from Theorem 2 and the fact that G G:

SN N

Theorem 3 For any game G, G G.

C N

Pro of. Given s 2 G, we wil l show that s is a Nash equilibrium in

pure strategies for G. Suppose not. By Theorem 1 it is enough to con-

sider what happens with single individuals. Then, for a i 2 I , given

the vector s ;:::;s ;s ;:::;s , the b est resp onse for i is s with

1 i 1 i +1 n 0

0 0 i

u s ;:::;s ;:::;s >u s. But that means that fig blocks s, and there-

i 1 n i

0 i 0

fore s 62 G.Contradiction. This proves that G G . Example

C C N

1 shows that the converse is not true: G= fhnc; ncig and G=;.

N C

Therefore G G.2

C N

An implication of the results in this section is that the other results of

this pap er which are derived for the -core carry over directly to analyses

that use strategic solution concepts Nash equilibrium, Coalition-Pro of Nash

equilibrium or Strong Nash equilibrium. Sp eci cally,any solution that is

stable according to the -core is also stable according to these three solution

concepts.

Another implication is that to verify that a strategy pro le is in the -

core, one needs to only consider strategy pro les that are Pareto-optimal

The non-emptyness of the -core is equivalent to the existence of a utilityvector U

which is common to all sets v fig for all agents i. By de nition 4, this means that there

and in Nash equilibrium. Alternatively, one can restrict this searchtoPareto-

optimal Strong Nash equilibria or Pareto-optimal Coalition-Pro of Nash equi-

libria.

5 Bounded rationality in coalition formation

In the previous section it was assumed that delib eration is costless. To

relax this assumption weintro duce delib eration and communication actions

explicitly into the mo del:

De nition 10 For each agent i in a game G, let D be a set of delib era-

tion/communication activities that i can perform in order to choose a strategy

s to be executed. Each d 2 D is associated with a C d , i.e. the cost for

i i i i i

iofperforming the activity d .

In order to avoid unnecessary complications, we assume that C can b e

expressed in the same units as u . We will not assume any sp ecial structure

on D , except the following:

De nition 11 For each agent i,weconsider its process of communica-

0 1 t

tion/delib eration fa ;a ;::: a g, where a 2 D , for t =0;1;:::;t 1,

i i

i i i

and a 2 S .IfN = max t , we say that the coalition structure has been

i i2I i

formedin N steps. For any i and t such that t t

The idea b ehind this de nition is that the agents delib erate and exchange

messages until each one decides on a strategy to follow. We also assume that

this pro cess is nite and each agent stays commited to its choice once it has

reached a decision.

We use a general characterization of the communication/delib eration pro-

cess without going into the details of how an action leads to another one e.g.

0 0

do es not exist a U such that U U for all i, with strict inequality for at least one

i. That is, U is Pareto-optimal. Therefore one can restrict the searchtoPareto-optimal

outcomes.

Implicit in this de nition is the existence of another level of delib eration, assumed

costless, needed for cho osing delib eration/communication actions. Although strong, this

assumption cannot b e dropp ed without leading to a p otential in nite regress [17, 32, 29,

5, 30, 42].

Activities in D and strategies in S will b e called the actions of agent i in the pro cess.

i i 16

how delib eration actions lead to the choice of physical actions. Wesay that

the payo of agent i in an N -p erio d pro cess is determined as follows:

De nition 12 If the communication/deliberation process of agent i is a^ =

t t

0 i i

fa ;:::;a g a = s , the payo is

i i

^a ;:::;a^ = u s ;:::;s ;:::;s C a +::: + C a C N t

i i n i 1 i n i i i i

where C > 0 is the waiting cost, which is assumedconstant per time unit.

As this de nition states, we assume that costs of activities are indep en-

1 N

dent: if the pro cess is ^a =a ;:::;a , its cost C ^a is equal to the sum of

i i i

i i

the costs of the activities, C a +::: + C a +C N t .

i i i i

A new game can b e de ned which explicitly considers the delib eration

and communication actions as part of each agent's strategy. This follows the

approach of Sandholm and Lesser [32, 33 ] in the sense that such actions

are explicitly incorp orated into the solution concept. It di ers from other

DAI approaches to coalition formation where the solution concept stability

criterion is only applied to the nal outcomes [43, 37 , 39 ].

n t

De nition 13 Given G, fD g and t> 0, a new game is de ned, G =

i=1

t n t n

D S ;P, where P : D S !< such that for each ^a 2

i i

i i=1 i=1 i

D S , P â= â ;:::; â .

i 1 1 n n

i=1 i

The length of the game, t, dep ends on the available communica-

tion/delib eration activities and on the sequential choice of activities. For

nowwe assume that the time limit is given in Subsection 5.2 we will relax

this assumption. To justify this, we supp ose that each agent, i, has a degree

of impatience, given by a maximum time, t , to make a nal decision.

In order to maximizepayo s, our self-interested agents engage in negoti-

ations. The nal outcomes represent the result of agreements among agents.

Coalitions are formed during the negotiations. A particular pro cess gener-

ates a stable coalitional structure if the nal outcome in the original game

with strategy pro le space S cannot b e blo cked by a coalition formed

i=1

in another pro cess. Formally,

0 t

De nition 14 Aprocess a^ = fa ;:::;a g2 D S leads to a stable

i=1

coalition structure if a 2 S cannot be blocked by any coalition formed

i=1

0 0 0

0 t

= fa ;:::;a g. in another process a^ 17

The relationship b etween stable coalition structures and the -core is

given by the following lemma.

0 t

Lemma 2 If the process a^ = fa ;:::;a g2 D S is in the -core

i=1 i

of game G then a^ leads to a stable coalition structure.

Pro of. Suppose that there exists a coalition T for which there exists an

t t

s 2 S such that u s u a for j 2 T and u s >u afor

i j j j j

i=1

0 0

j 2 T . Then ^a = fa ;:::;sg isaprocess in which T is formed and obtains

s, where s = s and P â P â, for j 2 T . Contradiction because â is

j j j j

. 2 in the -coreofG

We can easily restate the notions given in Section 2 in order to nd

conditions for the stability of the coalition structure. First, we de ne the

characteristic function for G , v , via replacing the optimal achievable util-

ities by the optimal achievable payo s which incorp orate delib eration and

communication:

t t n

De nition 15 Given G =D S ;P, and T I , the set of optimal

i i=1

achievable payo s for T = fj ;:::;j g is the set of P s such that

jTj

there exists ^a 2 D S and P ^a=P, and

i=1 i

0 0

69a^2 D S such that P ^a=P with P P for al l j 2 T ,

i j i

i=1 i j

>P for at least one j 2 T . and P

This means that P is an optimal achievable payo for coalition T if there

is no other payo vector such that the payo is no worse for any member

and it is b etter for at least one|for all in particular for the worst pro cesses

that nonmemb ers can pick. The following prop osition shows the analog of

Theorem 1 for the game that includes delib eration/communication:

Prop osition 3 a^ 2 D S is such that P ^a 2 v fig for each i

i G

i=1

i a^ is in G .

Pro of. Immediate from Theorem 1. 2

This means that a communication/delib eration-action pro cess in the -

core corresp onds to a Pareto-optimal payo vector. In the rest of this pap er 18

we will fo cus on -core pro cesses since they are not only Pareto-optimal in

G but also lead according to Lemma 2 to an outcome s 2 S that

i=1

cannot b e blo cked by a coalition formed in any other pro cess.

For many games the prisoner's dilemma b eing an example a Pareto-

optimal payo can b e reached only through the co ordinated activityof

agents, i.e. the Pareto-optimal outcome is reached only if each agent acts

in agreement with the other agents. Let us return to Example 1 to show

how communication/delib eration activities allow to reach a co ordination on

aPareto-optimal outcome:

Example 2 Consider again the game G of Example 1, and assume that the

communication/deliberation actions at every step are

0 00

D = D = fd; d ;d g

a b

and the associate costs are

C d=C d=0:1

a b

0 0

C d= C d =0:1

a b

00 00

C d = C d =0:5

a b

Say that the actions are abstractly described as fol lows:

d: engage in negotiations

d :reach an agreement

d : sign a contract to enforce the agreement

Moreover, we assume that engaging in negotiations, d, has as a consequence

the realization that if no enforceable agreement is reached, the outcome wil l

be hnc; nci.

0 0 00 00

The sequence hd; di; hd ;d i;hd ;d i; hc; ci is in the -corebecause the

payo for every player, 5 0:1+0:1+0:5, is higher than any other payo ,

considering that d is an unavoidable step in the process. If an agent would

choose the process fd; ncg it would know that the other agent would do the

same, and the payo would be only 2 0:1. 19

5.1 Incorp orating b elief revision

The previous example is very simple, but it shows the sequential nature of

an agent's choice of action. This subsection intro duces a more sophisticated

decision making mo del for an agent that takes part in coalition formation.

This mo del is used to show results on the joint outcomes and the joint pro cess.

Tocho ose the action that maximizes exp ected payo at each step, an

agentmay need to evaluate the exp ected payo s of di erent actions. We

will show when this pro cedure leads to the formation of a stable coalition

structure. To give a mathematical characterization, weintro duce the notion

of \exp ected payo " which is based on each agent's sub jective probabilities:

De nition 16 Given a sequence of actions performed by an agent i, a^ =

t+1

0 t

a ;:::;a 2 D , we say that agent i can de ne a subjective probability

i i

t+1

distribution on S such that sja is the conditional probability of

i=1 i

t+1

an outcome s, given that the next action is a .Aprobability distribution

on the total costs associated with the process to reach s 2 S can be also

i=1

t+1

de ned: C sja is the conditional probability of a cost C s, given that

i i

t+1

the next action is a . Then the expectedpayo , given that the next action

t+1

is a ,is

X X

t t+1 t t+1 t t+1

a = u s sja C s C sja

i i i

i i i i i i

Q Q

n n

s2 S C s;s2 S

i i i

i=1 i=1

An agent can try to maximize its exp ected payo in each step, i.e. to

t t

cho ose an a 2 D [ S that maximizes . This is a greedy pro cedure,

i i

and agents that use it may not always converge on a joint solution.

Akey element here is the b elief formation pro cess that generates the

t+1 t+1

t t

conditional probabilities sja and C sja . We assume in the

i i

Q Q

n n

following that S , D and the length of the pro cess, N , are common

i i

i=1 i=1

knowledge. Agents are assumed to up date their b eliefs using Bayes Rule [9].

The general decision pro cess is as follows:

Algorithm 1 At stage k =0 agent i generates a probability distribution

Q Q Q Q

n n n n

N 1 N 1

over D S , such that for each ^a 2 D S ,

i i i i

i=1 i=1 i=1 i=1

0 14 15

^a > 0. ,

14 0

Tocho ose a , agent i p erforms steps 2;:::;6 with k +1 = 0

Bayesian up dating requires p ositive probabilityofevery pro cess: if Bayes Rule is

applied, a pro cess with zero probability cannot get a p ositive probability thereafter. 20

For k =1:::N 1:

k k k

1. The last common action observedisa =a ;:::;a . Then, for every

1 n

k +1

0 N

a =a ;:::;a , the new distribution is such that

k +1 k k

a=0 if a 6= a

k k k +1

if a = a and a=

Q Q

n n

k +1 a

N 1

S : 6=0 D a2

i i

i=1 i=1

2. For each a 2 D , j =1;:::;jD j,

i i

k+1 k +1

sja = a

i i

k+1

a =a ;a =s

i i

3. For each a 2 D , j =1;:::;jD j,

i i

k+1 k+1

sja = a

i i

k +1

N 0 N

such that a = a , a = s and C s= C a ;:::;a

i i

i i i i

j j

k +1

4. For each a 2 D , the expectedpayo a is

i i i

X X

j j j

k +1 k +1 k +1

a = u s sja C s C sja

i i i

Q Q

n n

s2 S C s;s2 S

i i i

i=1 i=1

j j

k +1

5. Agent i selects the action a that maximizes the expectedpayo a

i i i

6. The action to perform is the a found in the previous step.

In words: each agent considers, at k = 0, the set of all p ossible pro cesses

and cho oses an action. After observing the actions of the other agents, it

discards all pro cesses that at the rst stage di er from the observed set of

actions. Then it cho oses an action according to its new b eliefs. The pro cess

is rep eated until in stage N 1 it has to cho ose a domain action in the

space S .At each stage, the set of p ossible pro cesses and therefore the set of

p ossible outcomes is narrowed, until a single outcome is pinp ointed. 21

Algorithm 1 has, in each stage, aworst case complexityof4B+

Q Q Q

n n n

2 N 1

jDjj S jB + B , where B = j D S j is the number

i i i i

i=1 i=1 i=1

of p ossible pro cesses. If D = D for all i and j , the worst case complex-

i j

ity b ecomes p olynomial in jD j . This clearly shows that Algorithms 1 is

exp onential in the numb er of agents and the length of negotiations.

When this pro cedure is p erformed in conjunction with co ordination

among agents in the sense that they happ en to follow a pro cess that is

in the -core, they will converge to the b elief that a particular outcome s is

the most probable one later we show that s is Pareto-optimal.

0 t N

Prop osition 4 If a^ =a;:::;a ;:::;a is in the -core Proposition 4

t t t

showed that this means that for each t, a =a;:::;a is the vector of

1 n

optimal decisions then 9m such that for t> m there exists an s 2 S

i=1

which gives the max sja for each i.

s2 S i

i=1

Pro of. If a^ is in the -core, P ^a is in v N fig by Theorem 1. Suppose

that for each m there exists t>m such that theredoes not exist s 2 S

i=1

which gives the max sja . In particular, given m = N 1,

s2 S i

i=1

N 1

for t = N theredoes not exist s giving max sja . So, it

s2 S i

i=1

means that at least one agent wil l deviate, making another strategy pro le

moreprobable. Then, since a 2 S it is clear that for at least one agent

0 0 0 0 0

1 N N N

i , ^a > ^a, where a =a ;:::;a , a = a = s for j 6= i and

i i j

j j

N N

a 6= a = s . Contradiction because P ^a 2 v fi g. 2

i i

The converse is not true. A pro cess that leads to a stable coalition struc-

ture might not b e in the -core. It is intuitive that a stable structure can

b e formed in a cost-inecient pro cess. This pro cess could b e blo cked by an-

other one leading to the same coalition structure, thus preserving stability.

Therefore, Prop osition 4 only gives a necessary condition for a pro cess to b e

an element of the -core. However, this is all we need since the following

result shows that a coalition structure is stable if it leads to convergence of

b eliefs ab out the strategy pro le to b e chosen.

This seems to b e a discouraging result. However, it is p erhaps less so in lightofthe

fact that in economic pro cesses where agents have indep endent preferences, exp onential

complexity in the numb er of agents or in the length of pro cesses is p ervasive [26]. 22

N 0 N

Theorem 4 For G ,aprocess ^a =a;:::;a leads to a stable coali-

tion structurei 9msuch that 8t > m 9s 2 S that gives the

i=1

sja for each i. max

S i s2

i=1

Pro of.

!Trivial: if a coalition T formed in a stage t m can block ^a

0 0

N N N

it means that there exists a^ 2 S , such that u a u a

i j j

i=1

0 0

N N N

for j 2 T and for one j 2 T , u a u a and a maximizes

j j

N 1

. Contradiction because the optimal decision in N is a .

Suppose that for al l m there exists t> msuch that thereisno

n t1

s2 S that gives the max sja for each i. If so, for

i=1 i i

s2 S

i=1

N 1

s > m = N 1, there exists i and s 6= a verifying that

N 1

N N

a , but then, fi g isacoalition that blocks a . Contradiction.

Alternatively, this result can b e stated in a more detailed way b ecause

utilities and costs are indep endent:

N 0 N

Theorem 5 In G ,aprocess a^ =a;:::;a leads to a stable coali-

tion structurei 9msuch that 8t> m and for each i, there exists a pair

t1 t1

N t

a ;a 2 S D such that u j and C C j achieve

i i i i i

i i

i i=1

N t 17

a maximum and a minimum respectively in a ;a .

Pro of.

!Assume that for al l m there exists i 2 I such that for al l a 2

t1 t1

D [ S there exists t> m for which u j and C C j

i i i i i

i i

N t

;a . Taking do not achieve a maximum and a minimum in a

m = N 1 it fol lows that there exists s 2 S such that

i=1

N 1 N 1 N 1

N N N N

u s sjs >ua a ja and C s C sja <

i i i i i

i i i

i i

N 1

N N N N

C a C a ja . Therefore i can block a . Contradiction.

i i

A trivial example where this conditions can b e ful lled is that of a game G with a

unique Nash-equilibrium that is also Pareto-optimal: each individual action chosen will

b e one that leads to that maximal outcome and will therefore minimize costs. There are

other examples as well. 23

suppose that there exists a coalition T I that can be formed

in another process ^a . Therefore for at least one i 2 T , there exists

N 1 N 1

N N N N

s 2 S s = a such that u s sjs >u a a ja

i i i i i

i i

i=1 i i

N 1 N1

N N

and C s C sjs

i i i i i i

i i

t1 t1

N N

cause for a ;a u j and C C j achieve a maxi-

i i i

i i

mum and a minimum. 2

Put together, Theorem 5 gives the conditions under which Algorithm 1

leads to a stable coalition structure. Sometimes the process that leads to

this coalition structure is stable according to the -core and sometimes not.

In either case, stability is equivalent to the coincidenceofbeliefs about the

process. That is, if the negotiation do es not lead the agents to share the

same exp ectations ab out the nal outcome, the result will not b e supp orted

by a stable coalition structure.

To see how this works, let us revisit Example 2, this time incorp orating

b elief revision:

Example 3 Consider the game G of Example 2, and assume that each agent

can choose among eight possible sequences of actions each sequence has at

most four stages:

0 00

I d; d ;d ;c

0 00

II d; d ;d ;nc

III d; d ;c

IV d; d ;nc

0 00

V d ;d ;c

0 00

VI d ;d ;nc

VII d ;c

V III d ;nc 24

We assume that agents a and b have symmetric prior probability distri-

0 18

butions. Speci cal ly, for i = a; b, the distribution is as fol lows:

0 0

sequences where the rst action is d: I =0:512; III= 0:128;

i i

0 0

II=0:128; IV = 0:032.

i i

0 0

sequences where the rst action is d : V =0:08; VII=0:02.

i i

0 0

sequences where the rst action is d : VII= 0:08; V III=

i i

0:02.

It is immediate that

1 0 0

hc; ci;d= I+ III= 0:64

i i i

1 0 0

hnc; nci;d= II+ IV =0:16

i i i

1 0

hc; ci;d = V=0:08

i i

1 0

hnc; nci;d = VI=0:02

i i

1 0

hc; ci;d = VII= 0:08

i i

1 0

hnc; nci;d = V III=0:02

i i

Let the costs for i = a; b be

C hc; ci=0:7 with probability 0:512if i chooses sequence I

C hc; ci=0:6 with probability 0:128if i chooses sequence III

We assume that all other p ossible cases have probability 0. This is contrary to Algo-

rithm 1, but it simpli es the exp osition. 25

C hc; ci=0:6 with probability 0:08if i chooses sequence V

C hc; ci=0:5 with probability 0:08if i chooses sequence VII

Note that C hc; ci varies depending on the communication/deliberation

process that leads to the domain actions hc; ci.Wecan also include a \breach-

ing cost" into the model, say 0:6. This means that if the agents agreeona

particular pro le of domain strategies, but some agent deviates, then that

agent has to pay the breaching cost. With the breaching cost includedwe

have:

C hnc; nci=1:3 with probability 0:128if i chooses sequence II

C hnc; nci=1:2 with probability 0:032if i chooses sequence IV

C hnc; nci=1:2 with probability 0:02if i chooses sequence VI

C hnc; nci=1:1 with probability 0:02if i chooses sequence V III

Therefore, the agent wil l have the fol lowing beliefs about costs:

C hc; cijd=0:64 choosing sequences I or III

C hnc; ncijd=0:16 choosing sequences II or IV

C hc; cijd =0:08 choosing sequence V

C hnc; ncijd =0:02 choosing sequence VI

C hc; cijd =0:08 choosing sequence VII

i 26

C hnc; ncijd =0:02 choosing sequence V III

Final ly, the possible expectedpayo s in the rst iteration are, for i = a; b:

1 1 1

d = u hc; ci hc; ci;d+u hnc; nci hnc; nci;d

i i

i i i

1 1

C hc; ci C hc; cijd C hnc; nci C hnc; ncijd

i i i i

i i

= 5 0:64 + 2 0:16 0:7+0:6 0:64 1:3+1:2 0:16 = 2:288

0 0 0

1 1 1

d = u hc; ci hc; ci;d +u hnc; nci hnc; nci;d

i i

i i i

0 0

1 1

C hc; ci C hc; cijd C hnc; nci C hnc; ncijd

i i i i

i i

= 5 0:08 + 2 0:02 0:6 0:08 1:2 0:02=0:368

00 00 00

1 1 1

d = u hc; ci hc; ci;d +u hnc; nci hnc; nci;d

i i

i i i

00 00

1 1

C hc; ci C hc; cijd C hnc; nci C hnc; ncijd

i i i i

i i

= 5 0:08+2 0:02 0:5 0:08 1:1 0:02 = 0:378

So, according to Algorithm 1, both agents choose action d. Therefore

only sequences I to IV are stil l possible. Probabilities arere-evaluated us-

ing the fact that the total probability of sequences beginning with d was

0 0 0 0

I + II+ III+ IV = 0:8, thus assigning probability 0 to

i i i i

processes V; V I ; V I I and V III:

0:512

I= = =0:64

0:8 0:8

III

0:128

III= = =0:16

0:8 0:8

0:128

= =0:16 II=

0:8 0:8

0:032

IV = = =0:04.

0:8 0:8

Therefore, the beliefs about the nal domain strategies are

2 1

hc; ci;d = I= 0:64

i i 27

2 1

hnc; nci;d = II= 0:16

i i

2 1

hc; ci;d = III= 0:16

i i

2 1

hnc; nci;d = IV =0:04

i i

and the beliefs about the corresponding costs are

1 2

= I= 0:64 C hc; cijd

i i

2 1

C hnc; ncijd = II=0:16

i i

2 1

C hc; cijd = III= 0:16

i i

2 1

C hnc; ncijd = IV = 0:04

i i

The expectedpayo s are:

0 0 0

2 2 2

d = u hc; ci hc; ci;d +u hnc; nci hnc; nci;d

i i

i i i

0 0

2 2

C hc; ci C hc; cijd C hnc; nci C hnc; ncijd

i i i i

i i

= 5 0:64 + 2 0:16 0:7 0:64 1:3 0:16=2:864

00 00 00

2 2 i

d hc; ci;d hnc; nci;d = u hc; ci +u hnc; nci

i i

i i a

00 00

2 2

C hc; ci C hc; cijd C hnc; nci C hnc; ncijd

i i i i

i i

= 5 0:16+2 0:04 0:7 0:16 1:3 0:04 = 0:716

Now, both agents wil l choose d . Therefore only sequences I and II remain

feasible. Since the only possibility is to choose d ,wecan directly proceedto

the fourth iteration of Algorithm 1. The probabilities are since I +

II=0:8

0:64

3 2

I= I= = =0:8

i i

0:8 0:8 28

0:16

3 2

II= II= = =0:2.

i i

0:8 0:8

Therefore the beliefs become:

4 3

hc; ci;c= I= 0:8

i i

4 3

hnc; nci;nc= II= 0:2

i i

4 3

C hc; cijc= I= 0:8

i i

4 3

C hnc; ncijnc= II= 0:2.

i i

Now the expectedpayo s are:

4 4 4

c = u hc; ci hc; ci;c C hc; ci C hc; cijc

i i i

= 5 0:8 0:7 0:8=3:44

4 4 4

nc = u hnc; nci hnc; nci;nc C hnc; nci C hnc; ncijnc

i i i

= 2 0:2 1:3 0:2=0:14

So c is the chosen domain strategy for each agent. Therefore, the process

I I each agent choosing sequence I is stable according to Theorem 4:

0 00

1 2 3

hc; cijd; hc; cijd and hc; cijd were maximal among the condi-

i i i

tional probabilities of strategies in G given actions taken from D .

To summarize, even in the presence of uncertainty, if b eliefs at the b e-

ginning of the delib eration/communication pro cesses are \reasonable", the

agents will converge to a pro cess in the -core. As Example 3 shows, Algo-

rithm 1 leads to a shared b elief among the agents. This shared p oint of view

supp orts a stable pro cess|the same one as the one leading to co op eration

in Example 1. 29

5.2 Delib eration/communication pro cesses of di erent

lengths

The results of the previous section are highly dep endent on the length of

the pro cess: two pro cesses ^a and ^a are comparable only if their lengths are

the same. If not, Theorems 4 and 5 cannot b e applied. If we maintain

that the degree of impatience of each agent, t , is given b eforehand, it is

clear that the game has a de nite length max t .Even if not, a condition

i2I i

on the convergence of b eliefs can b e given. The following result shows that

every convergent pro cess in the sense that agents agree in their b eliefs ab out

the nal outcome, leads to a stable coalition structure in an endogenously

de ned timing. In other words, there always exists a pro cess that provides

the outcome on which agents agree, and the length of the pro cess is nite

even if it is not given exogenously. This result is indep endent of the b elief

up dating mechanism used by the agents.

Theorem 6 If 9m such that 8i and 8t > m, u j and

C C j have a maximum and a minimum respectively in a pair

i i

t N

s ;a , then there exists a nite N such that a = s .

Pro of. Given that for each i, for each t > m, u j and

C C j have a maximum and a minimum respectively in a pair

i i

t i

s ;a , we know that for every i, there exists t such that s = a otherwise

i i

the process a^ is in nite and C s !1 and therefore ^a !1. For

i i i

t> t , a is the action of waiting for the decision of the other agents. Then,

taking N = max t ,wesee that a = s . 2

i i

What this result indicates is simply that if a negotiation has to end suc-

cessfully, some agents havetowait until the rest of the negotiators make

their decisions. Therefore, an externally given end-time is unnecessary in

successful negotiations as well as in unsuccessful ones.

5.3 Relating outcomes of rational and b ounded ratio-

nal agents

Communication/delib eration pro cesses were intro duced in order to describ e

the dynamic formation of coalition structures in the original game G. Theo-

rem 7 shows that the outcome of a pro cess that converges to a stable coalition 30

structure is Pareto-optimal part of the Pareto or eciency frontier in the

original game G. Conversely,anyPareto-optimal outcome can b e supp orted

by a pro cess that converges to a stable coalition structure. Formally, the

relationship b etween outcomes in a stable coalition structure formed in a

communication/delib eration pro cess, and outcomes achieved by p erfectly ra-

tional agents in the original game G is as follows:

1 N

Theorem 7 Aprocess a^ =a ;:::;a leads to a stable coalition structure

i theredoes not exists s 2 S , such that u s u a for al l i and

i i i

i=1

u s >u a for at least one i .

i i

Pro of.

!Suppose that there exists s 2 S , u s u a for al l i

i i i

i=1

and at least for one i , u s >u a . Then, another process a^ =

i i

0 0 0

1 N N

a ;:::;a can be generated, a = s. Contradiction.

0 0 0

1 N

Suppose that there exists another process a^ = a ;:::;a ,

N N

a = s, such that for at least one i , u s >u a . Contradic-

i i

tion. 2

This result can b e interpreted p ositively or negatively. The negative as-

p ect is that stability is not enough to select a single outcome. Seen p ositively,

Theorem 7 says that any stable pro cess will lead to an ecient result and,

conversely, that any ecient outcome can b e attained by means of a stable

communication/delib eration pro cess.

6 Conclusions

We analyzed the problem of coalition formation in games without sidepay-

ments. First, the -core solution concept was reviewed in the context of

games in which agents are p erfectly rational. We showed that a solution is in

the -core if the corresp onding utility pro le is Pareto-optimal, i.e. an indi-

vidual utility cannot b e improved without diminishing the utility of another

There is an analogy b etween this result and the folk theorems for rep eated games [9].

Both show that the outcome of a pro cess lies in a particular region of the payo s space:

ab ove the minimax p oint in the folk theorems and the Pareto-frontier here. 31

agent. This prop erty is closely related to sup eradditivity, a prop erty indi-

cating that shared optimal achievable utilities for di erent coalitions remain

optimal achievable utilities for the union of the coalitions. Sup eradditivity

implies that anytwo coalitions are b est o by merging.

Next we explored the relationships b etween axiomatic and strategic so-

lution concepts. We showed that any solution that is stable according to

the -core corresp onds to a Strong Nash equilibrium and to a Coalition-

Pro of Nash equilibrium and a Nash equilibrium. This allows us to study

games with the -core solution concept while our p ositive stability results

carry over directly to these three strategic equilibrium-based solution con-

cepts. This also allows one to con ne the search for stable -core solutions to

the space of Pareto-ecient Strong Nash equilibria or Coalition-Pro of Nash

equilibria or Nash equilibria.

For b ounded rational agents we showed that the -core solution concept

provides clues ab out the prop erties of the delib eration/communication pro-

cesses that lead to stable coalition structures. Sp eci cally,we showed that a

pro cess leads to a stable coalition structure if its outcome cannot b e blo cked

by a coalition formed in another pro cess of the same length.

Wecharacterized the communication/delib eration pro cess as a greedy

stepwise maximization of exp ected payo where delib eration and communi-

cation actions incur costs. We showed that when agents agree to a pro cess

that is in the -core, this greedy algorithm leads to convergence of the agents'

b eliefs in a nite numb er of steps. We also showed that the convergence of

b eliefs implies that the nal outcome is stable. This holds when the proto-

col length is exogenously restricted as well as when agents can endogenously

decide the length. More general mathematical conditions for such stability

were also derived, and their meaning seems inescapable: stability can only

be achieved when all the agents share their b eliefs ab out the nal outcome

of the negotiation.

Finally,we showed that the outcome of any communication/delib eration

pro cess that leads to a stable coalition structure is Pareto-optimal for the

original game that do es not incorp orate communication or delib eration. Con-

versely,anyPareto-optimal outcome can b e supp orted by a communica-

tion/delib eration pro cess that leads to a stable coalition structure. 32

References

[1] Aumann, R., Acceptable Points in General Co op erative n-Person Games,

in Tucker,A. and Luce,D. eds. Contributions to the Theory of Games IV,

Princeton University Press, Princeton 1959.

[2] Aumann, R., The Core of a Co op erative Game without Sidepayments, Trans-

actions of the American Mathematical Society 98: 539-552, 1961.

[3] Aumann, R., Sub jectivity and Correlation in Randomized Strategies, Journal

of Mathematical Economics 1: 67-96, 1974.

[4] Bernheim, B., Peleg, B., and Whinston, M., Coalition-Pro of Nash Equilib-

ria: I Concepts, Journal of Economic Theory 42: 1-12, 1987.

[5] Bo ddy, M., and Dean, T., Delib eration Scheduling for Problem Solving in

Time-Constrained Environments, Arti cial Intel ligence 67: 245-285, 1994.

[6] Clarke, E., Multipart pricing of public go o ds, Public Choice 11: 17-33, 1971.

[7] Crawford, V., and Sob el, J., Strategic Information Transmission, Economet-

rica 50: 579-594, 1982.

[8] Ephrati, E., and Rosenschein, J., Deriving Consensus in Multiagent Systems,

Arti cial Intel ligence 87: 21-74, 1996.

[9] Fudenb erg, D., and Tirole, J., Game Theory, MIT Press, Cambridge 1991.

[10] Groves, T., and Ledyard, J., Optimal allo cation of public go o ds: a solution

to the 'free rider' problem, Econometrica 45: 783-809, 1977.

[11] Kahan, J. and Rap op ort, A., Theories of Coalition Formation,Lawrence

Erlbaum Asso ciates, 1984.

[12] Kalakota, R., and Whinston, A.B., Frontiers of Electronic Commerce,

Addison-Wesley, N.Y. 1996.

[13] Ketchp el, S., Forming Coalitions in the Face of Uncertain Rewards, Proceed-

ings of the National ConferenceonArti cial Intel ligence: 414-419, 1994.

[14] Kraus, S., Wilkenfeld, J., and Zlotkin, G., Multiagent Negotiation Under

Time Constraints, Arti cial Intel ligence, 75: 297{345, 1995. 33

[15] Kreps, D., and Wilson, R., Sequential Equilibria, Econometrica 50: 863-894,

1982.

[16] Kreps, D., A Course in Microeconomic Theory, Princeton University Press,

Princeton 1990.

[17] Lipman, B., How to Decide to Decide How to Decide How to... : Mo deling

Limited Rationality, Econometrica 59: 1105-1126, 1991.

[18] Luce, R. D., and Rai a, H., Games and Decisions, John Wiley and Sons,

New York 1957.

[19] Mas-Colell, A., Whinston, M., and Green, J., Microeconomic Theory, Ox-

ford Economic Press, Oxford 1995.

[20] Mo ore, J. Implementation, Contracts, and Renegotiation in Environments

with Complete Information, in Advances in Economic Theory vol I, La ont,

J-J. ed, Cambridge University Press, Cambridge 1992.

[21] Moulin, H., Cooperative Microeconomics: a Game Theoretical Introduction,

Princeton University Press, Princeton 1995.

[22] Myerson, R., Credible Negotiation Statements and Coherent Plans, Journal

of Economic Theory 48: 264-303, 1989.

[23] Nash, J., Nonco op erative Games, Annals of Mathematics 54: 289-295, 1951.

[24] Pearl, J., Probabilistic Reasoning in Intel ligent Systems, Morgan-Kaufmann,

San Mateo CA, 1988.

[25] Peleg, B., Axiomatizations of the Core, in Aumann, R., and Hart, S. eds.

Handbook of Game Theory vol. I, North-Holland, Amsterdam 1992.

[26] Reiter, S., and Simon, C., Decentralized Pro cesses for Finding Equilibrium,

Journal of Economic Theory 56: 400-425, 1992.

[27] Rosenschein, J. S., and Zlotkin, G., Rules of Encounter, MIT Press, Cam-

bridge 1994.

[28] Rubinstein, A., Perfect Equilibria in a Bargaining Mo del, Econometrica 50:

97-109, 1982. 34

[29] Russell, S., and Wefald, E., Do the Right Thing: Studies in LimitedRatio-

nality, MIT Press, Cambridge 1991.

[30] Sandholm, T., and Lesser, V., Utility-Based Termination of Anytime Al-

gorithms, Proceedings of the ECAI Workshop on Decision Theory for DAI

Applications: 88-99, 1994.

[31] Sandholm, T., Unenforced Ecommerce Transactions, IEEE Internet Com-

puting 1: 47-54, 1997.

[32] Sandholm, T., and Lesser, V., Coalitions among Computationally Bounded

Agents, Arti cial Intel ligence, Sp ecial Issue on Economic Principles of Mul-

tiagent Systems 94: 99-137, 1997.

[33] Sandholm, T., and Lesser, V., Coalition Formation among Bounded Rational

Agents", Fourtheenth International Joint ConferenceonArti cial Intel ligence

IJCAI-95: 662-669, 1995.

[34] Sandholm, T., Larson, K., Andersson, M., Shehory, O. and Tohm e, F., Any-

time Coalition Structure Generation with Worst Case Guarantees, Proceed-

ings of the Fifteenth National ConferenceonArti cial Inteligence AAAI-98:

46-53, 1998.

[35] Sandholm, T., and Lesser, V., Advantages of a Leveled Commitment Con-

tracting Proto col, Proceedings of the Thirteenth National ConferenceonAr-

ti cial Inteligence AAAI-96: 126-133, 1996.

[36] Sandholm, T., Negotiation among Self-Interested Computational ly Limited

Agents, Ph.D. thesis, Dept. of Computer Science, University of Massachusetts,

Amherst 1996.

[37] Shehory, O., and Kraus, S., Task Allo cation via Coalition Formation among

Autonomous Agents, Fourtheenth International Joint ConferenceonArti -

cial Intel ligence IJCAI-95: 655-661, 1995.

[38] Shehory, O., and Kraus, S., Metho ds for Task Allo cation via Agent Coalition

Formation, Arti cial Intel ligence 101: 165-200, 1998.

[39] Shehory, O., and Kraus, S., A Kernel-oriented Mo del for Coalition-formation

in General Environments: Implementation and Results, Proceedings of the

Thirteenth National ConferenceonArti cial Inteligence AAAI-96: 134-

140, 1996. 35

[40] Shubik, M., Game Theory Mo dels and Metho ds in Political Economics, in

Arrow, K., and Intriligator, M. eds. Handbook of Mathematical Economics

vol.I, North-Holland, Amsterdam 1981.

[41] Tohm e, F., and Sandholm, T., Coalition Formation Pro cesses with Belief

Revision among Bounded Rational Self-Interested Agents, Fifteenth Interna-

tional Joint ConferenceonArti cial Intel ligence IJCAI-97, Workshop on

Social Interaction and Communityware: 43-51, Nagoya, Japan, August 25,

1997.

[42] Zilb erstein, S., and Russell, S., Optimal Comp osition of Real-Time Systems,

Arti cial Intel ligence 82: 181-213, 1996.

[43] Zlotkin, G., and Rosenschein, J., Coalition, Cryptography and Stability:

Mechanisms for Coalition Formation in Task Oriented Domains, Proceedings

of the Twelfth National ConferenceonArti cial Inteligence AAAI-94: 432-

437, 1994. 36